Series preface


The Mouton-NINJAL Library of Linguistics (MNLL) series is a new collaboration 
between De Gruyter Mouton and NINJAL (National Institute for Japanese Language 
and Linguistics), following the successful twelve-volume series Mouton Handbooks 
of Japanese Language and Linguistics. This new series publishes research mono- 
graphs as well as edited volumes from symposia organized by scholars affiliated 
with NINJAL. Every symposium is organized around a pressing issue in linguis- 
tics. Each volume presents cutting-edge perspectives on topics of central interest 
in the field. This is the first series of scholarly monographs to publish in English on 
Japanese and Ryukyuan linguistics and related fields. 

NINJAL was first established in 1948 as a comprehensive research organiza- 
tion for Japanese. After a period as an independent administrative agency, it was 
re-established in 2010 as the sixth organization of the Inter-University Research 
Institute Corporation “National Institutes for the Humanities”. As an international
hub for research on Japanese language, linguistics, and Japanese language educa- 
tion, NINJAL aims to illuminate all aspects of the Japanese and Ryukyuan languages 
by conducting large-scale collaborative research projects with scholars in Japan 
and abroad. Moreover, NINJAL also aims to make the outcome of the collaborative 
research widely accessible to scholars around the world. The MNLL series has been 
launched to achieve this second goal. 

The authors and editors of the volumes in the series are not limited to the scholars
who work at NINJAL but include invited professors and other scholars involved
in the collaborative research projects. Their common goal is to disseminate their 
research results widely to scholars around the world. 

This is the second volume originating from an international conference jointly 
held by Tohoku University and NINJAL, featuring a collection of papers on psycho- 
linguistics related to Japanese from comparative perspectives. It aims to bridge the 
gap between theoretical and psycholinguistic studies in the field, covering language 
production and comprehension by children, patients with aphasia, individuals 
with autism spectrum disorder, as well as typically developed adult speakers. It will 
provide both students and experts with essential information for their research 
and insights into the current state of the art in their respective subfields.


Yukinori Takubo 
Haruo Kubozono 
Yo Matsumoto 
Preface


Issues in Japanese Psycholinguistics from Comparative Perspectives came out of the 
International Symposium on Issues in Japanese Psycholinguistics from Compara- 
tive Perspectives (IJPCP) held online in September 2021. IJPCP consisted of twen- 
ty-nine papers in ten sessions over two days. It was jointly organized by the JSPS 
Grant-in-Aid for Scientific Research (S) Project “Field-Based Cognitive Neuroscientific
Study of Word Order in Language and Order of Thinking from the OS Language
Perspective” and NINJAL (National Institute for Japanese Language and Linguistics)
Collaborative Research Project “Cross-linguistic Studies of Japanese Prosody
and Grammar” and cosponsored by the Advanced Institute of Yotta Informatics (AI
Yotta), Tohoku University, Japan. 

Issues in Japanese Psycholinguistics from Comparative Perspectives is in two 
volumes: Cross-Linguistic Studies (Volume 1) and Interaction Between Linguistic and 
Nonlinguistic Factors (Volume 2). The two volumes together include 27
papers, all presented at the conference except for two, by Takuya
Kubo and Jungho Kim, respectively, who were unable to attend the symposium. All
the papers went through peer review, and I would like to thank those who kindly 
acted as inside or outside reviewers. 

In organizing the international symposium and editing the volumes, I received 
invaluable assistance from numerous people. First and foremost, I am grateful 
to Yukinori Takubo (former Director-General of NINJAL) and Haruo Kubozono 
(former Deputy Director-General of NINJAL) for their continuous support that 
made this project possible. Sachiko Kiyama, Kexin Xiong, Maho Morimoto, Misato 
Ido, Min Wang, Ge Song, Lega Cheng, and Rei Emura helped organize the conference.
Thanks are also due to Michaela Göbels and De Gruyter Mouton for their
support. The conference and the editing of the volumes were funded by NINJAL and
JSPS KAKENHI Grant Number 19H05589. 
Chapter 1

Japanese Psycholinguistics from 
Comparative Perspectives: Interaction 
Between Linguistic and Nonlinguistic Factors 


1 Introduction 


Issues in Japanese Psycholinguistics from Comparative Perspectives comprises two 
volumes compiling 27 state-of-the-art articles on Japanese psycholinguistics and 
related topics. It emphasizes the importance of using comparative perspectives 
when conducting psycholinguistic research. 

Psycholinguistic studies of the Japanese language have contributed greatly 
to the field from a cross-linguistic perspective. However, the target languages for 
comparison have been limited. Most research focuses on English and a few other 
typologically similar languages, which are nominative-accusative and subject-be- 
fore-object languages, as is Japanese. Thus, many current theories fail to acknowl- 
edge the nature of ergative-absolutive or object-before-subject languages and treat 
the nature of nominative-accusative subject-before-object languages as universal 
to human language. Therefore, a detailed consideration of the language processing 
stages of more diverse languages (in addition to familiar languages), relative to 
Japanese, is essential to clarify the universality and diversity of human language 
and to correctly situate Japanese among languages worldwide. 

Beyond the cross-linguistic approach, other prominent methods of compar- 
ison in psycholinguistics include comprehension versus production, prosodic 
versus syntactic processing, syntactic versus semantic processing, semantic 
versus pragmatic processing, native speakers versus second language learners, 
typical development versus development of language by people with autism 
spectrum disorder, typical versus aphasic language development, language 
versus action, and language versus memory. Comparative studies have proven 
to be fruitful in revealing the nature of various components of human cognition 
and how they interact. Many such approaches are underrepresented in Japanese 
psycholinguistics. 
The studies reported in the two volumes attempt to bridge these gaps. Using
various experimental and computational techniques, they address issues of the 
universality and diversity of human language and the nature of the relationship
between human cognitive modules. Special reference is made to the mechanisms
by which the brain processes and represents languages.


2 Outline of Volume 2 


Volume 2 contains 13 papers, all related to interactions between linguistic and
nonlinguistic factors. In Chapter 2, “High sense of agency versus low sense of agency
in event framing in Japanese,” Manami Sato, Keiyu Niikuni, and Amy J. Schafer
investigate how the language-user-internal factor of “Sense of Agency” (SoA, Moore 
2016) influences the language users' framing of action events and selection of per- 
spective, as measured by native speakers’ responses to active and passive voice 
in Japanese. The effects of temporarily manipulating SoA (Experiment 1) and an 
individual's intrinsic differences in SoA (Experiment 2) were tested using two 
picture-word verification experiments. The results suggest that physical motion 
shifts language users’ SoA, and SoA affects language users’ event framing: high SoA 
enhances an agent perspective, while low SoA enhances a patient perspective in 
transitive event comprehension. 

Chapters 3 and 4 examine the interaction between linguistic complexity and 
working memory. In Chapter 3, “Locality-based retrieval effects are dependent 
on dependency type: A case study of a negative polarity dependency in Japa- 
nese,” Kentaro Nakatani hypothesizes that the dependency type interacts with the 
use of working memory in processing sentences. Specifically, the author assumes 
non-thematic dependencies are more prone to activation decay than thematic 
dependencies—yielding locality effects—because they are often linearly discontin- 
uous. Although the two self-paced reading experiments manipulating the distance 
between a negative polarity item sika and its licenser Neg did not directly support 
this hypothesis, a significant interaction between the locality effects and partici- 
pants’ comprehension performance was identified in that good readers tended 
to exhibit stronger locality effects. This finding accords with previous findings by 
Nicenboim et al. (2016), who reported a correlation between locality effects and 
working memory capacity. 

In Chapter 4, “An EEG analysis of long-distance scrambling in Japanese: Head- 
direction, reanalysis, and working memory constraints,” Shingo Tokimoto and 
Naoko Tokimoto examine the processing of the discontinuous dependency in Japanese
complex sentences using event-related potentials (ERPs), paying close attention
to the head-direction difference between English and Japanese. The authors
manipulated a possible syntactic island in Japanese by long-distance scrambling 
and observed four ERPs: anterior negativity, parietal positivity, occipital negativity, 
and late parietal positivity. They consistently interpreted these ERPs as manifesta- 
tions of the additional working memory load, reanalysis, verb-arguments thematic 
correspondence, and detection of a syntactic anomaly. 

The subsequent five chapters commonly address the flexible nature of the 
human parser. In Chapter 5, “The time course of SOV and OSV sentence processing
in Japanese,” Katsuo Tamaoka considers five questions that examine the processing
mechanism of subject-object-verb (SOV)- and object-subject-verb (OSV)-ordered
transitive sentences in the head (verb)-final language of Japanese. (1) Why
do OSV-scrambled sentences require more processing time than canonical SOV 
sentences? (2) Does longer-distance scrambling require longer processing times 
than shorter-distance scrambling? (3) Are head (verb)-final languages disadvan- 
tageous for sentence processing? (4) What function does a finally positioned verb 
have in sentence processing? (5) How does the nature of topicalization affect sen- 
tence processing? The answers to these questions indicate that OSV-scrambled 
sentences are processed by gap-filling parsing. Although pre-head anticipatory 
processing functions before a final verb are observed, argument information is 
also required, especially for a scrambled sentence. Subject-topicalized sentences 
in the same order as canonical SOV sentences may be interpreted as being in the 
canonical order. However, object topicalization may involve double movements of 
scrambling and topicalization, requiring even longer processing time than for a 
single movement forming an OSV-scrambled order. Given these properties in Jap- 
anese, further research into the effects of topicalization should be conducted on a 
head (verb)-initial language in which the topicalized order does not overlap with 
either of the aforementioned orders. 

In Chapter 6, “Sentence processing cost caused by word order and context:
Some considerations regarding the functional significance of P600,” Daichi Yasunaga
observes how the appearance of the P600 component changes in Japanese
subject-object (SO) word order versus object-subject (OS) word order depending on
whether line drawings are given as the context. The experimental results revealed that OS
word order has a higher processing cost than SO word order and generates P600.
Furthermore, depending on the presence of context, the P600 effects differed in their scalp
distribution and the region (bunsetsu) in which they appeared. This suggests that
the P600 is an ERP component that reflects both a syntactic processing load and a more
general cognitive load.

In Chapter 7, “The adaptive nature of language comprehension,” Masataka
Yano examines the cognitive mechanisms underlying the adaptation achieved 
during language comprehension. Yano addresses the following two issues through 
a series of event-related potential experiments. First, linguistic adaptation is
expectation-based. Native speakers of Japanese can adjust their expectations for sentences
that lack licit syntactic representation but are predictable. Furthermore,
a strong prediction error serves as a trigger for adaptation. These findings rule 
out the hypothesis that accounts for linguistic adaptation in terms of the residual 
or baseline activation boosts of previously processed representations. Second, lin- 
guistic adaptation is selective and rational. Morphosyntactic and aspectual viola- 
tions induce adaptive behavior, whereas there was no evidence for adaptation to 
semantic violations. This rational adaptation in comprehension may indicate an 
underlying cooperative alignment in language communication. 

In Chapter 8, “(Dis)similarities between semantically transparent and lexical- 
ized nominal suffixation in Japanese: An ERP study using a masked priming par- 
adigm,” Jun Nakajima and Shinri Ohta investigate the lexical processing of mor- 
phologically complex words. The Japanese language has two types of de-adjectival 
nouns: -sa and -mi nouns. The former is productive and semantically transpar- 
ent, while the latter is unproductive and has a lexicalized meaning. In this elec- 
troencephalographic study, the authors examine how these nouns are processed 
in the brain. Using a masked priming paradigm, they demonstrated similarities 
in priming effects on the N400 and laterality effects on the N170 for -sa and -mi 
nouns. They also found dissimilarities between -sa and -mi nouns: a larger N400 
and lower behavioral performance for -mi nouns given their lexicalized meaning. 
Using linear mixed-effects models, they determined that the transition probability 
from stem to suffix attenuated the N400 in the temporoparietal regions. Moreover, 
both de-adjectival noun types exhibited significant priming effects in the behavio- 
ral data; that is, shorter reaction times and lower error rates under the related con- 
dition. The results suggest (dis)similarities in the neural mechanisms for processing 
two types of de-adjectival nouns in Japanese. 

In Chapter 9, “Brain mechanisms for the processing of Japanese subject-marking
particles wa, ga, and no,” Toshiki Iwabuchi, Satoshi Nambu, Kentaro Nakatani,
and Michiru Makuuchi report on two functional magnetic resonance imaging 
experiments conducted to investigate the brain mechanisms underlying the pro- 
cessing of Japanese subject-marking syntactic particles, wa, ga, and no. Experiment 
1 revealed that, relative to the nominative marker ga, the topic marker wa induced
higher activity in cortical regions associated with syntactic structure building. 
Experiment 2 compared no-marked genitive subjects with ga-marked nominative 
subjects. Brain regions related to syntactic reanalysis displayed increased activity 
for the genitive subjects, which are less frequent than the nominative subjects, but
the effect was significant only in the early period of the experiment. These results 
suggest that distinct Japanese subject-marking particles drive different neural sub- 
systems of syntactic processing. 




Chapters 10 and 11 shed light on language use by atypical populations. In 
Chapter 10, “Pragmatic atypicality of individuals with autism spectrum disorder:
Preliminary results of a production study of sentence-final particles in Japanese,”
Taiga Naoe, Tsukasa Okimura, Toshiki Iwabuchi, Sachiko Kiyama, and Michiru 
Makuuchi consider the pragmatic atypicality of individuals with autism spectrum 
disorder (ASD). ASD is a neurodevelopmental disorder exhibiting atypicality in the 
pragmatic aspects of language. Some case studies reported that individuals with 
ASD seldom use sentence-final particles (SFPs), which are bound morphemes rep- 
resenting the speaker's attitudes and moods; however, there is a need for more 
empirical evidence to verify this tendency. The authors compared the use of the 
SFPs, -ne and -yo, in the same context between Japanese-speaking adults with ASD 
and typically developed (TD) adults through an oral discourse completion task. 
The results revealed that adults with ASD used -ne less frequently and -yo in inap- 
propriate contexts more frequently than TD adults. 

In Chapter 11, “Auditory comprehension of Japanese scrambled sentences 
by patients with aphasia: An ERP study," Michiyo Kasai, Sachiko Kiyama, Keiyu 
Niikuni, Shingo Tokimoto, Liya Cheng, Min Wang, Ge Song, Kohei Todate, Hidetoshi
Suzuki, Shunji Mugikura, Takashi Ueno, and Masatoshi Koizumi investigate
the processing load of semantically reversible Japanese sentences in canonical
and scrambled word orders in patients with two common types of
aphasia: Broca's and Wernicke's aphasia. Like their healthy counterparts, patients
with Broca's aphasia showed a P600 effect for processing scrambled word
order relative to the canonical one. However, patients with Wernicke's aphasia did 
not exhibit any significant ERP effects. Thus, patients with Broca's aphasia, whose 
lesions are limited to the frontal regions, can sufficiently analyze case particles 
to comprehend the complex syntactic structures, whereas those with Wernicke's 
aphasia, whose lesions extend from the frontal to the temporal lobe, may have a 
functional disconnection for processing them. 

Chapters 12 and 13 present studies on the interaction of children's syntactic 
processing with information structure and prosodic cues, respectively. In Chapter 
12, “Experimental studies on clefts and right dislocations in child Japanese,” Kyoko
Yamakoshi and Hiroyuki Shimada examine various aspects of children's acquisi- 
tion of Japanese clefts (JCs) and Japanese right dislocations (JRDs). Although JCs 
and JRDs have similar non-canonical word orders (e.g., OVS), the first experiment 
showed that children performed differently between the two cases, indicating that 
children are aware of the difference between them. The authors suggest that children's
good performance with JRDs is based on their information structure (Tomioka
2021), and their poor performance with subject clefts reflects the agent-first strat- 
egy (Bever 1970; Hayashibe 1975). The second experiment revealed that children 
have an adult-like knowledge of the anti-reconstruction property of JCs and the 
reconstruction property of JRDs. The third experiment established that Japanese-speaking
children associate the focus particles dake/sika incorrectly in JCs and
JRDs as English-speaking children do with “only” in the English language (Crain, Ni,
and Conway 1994). Moreover, Japanese-speaking children exhibit an SO asymmetry, as found
in English-speaking children. The authors argue that children’s incorrect associations in JCs and
JRDs were not based on surface linear order but on hierarchical structures after
reconstruction.

In Chapter 13, “Developmental changes in the interpretation of an ambiguous
structure and an ambiguous prosodic cue in Japanese,” Yuki Hirose and Reiko
Mazuka investigate whether adults and children abide by the same processing 
bias when encountering a global structural ambiguity and whether they share a 
common understanding of what certain prosodic phenomena signal in resolving 
the ambiguity. They first examine whether young children exhibit an adult-like pro- 
cessing bias (a local interpretation of a modifier), which supposedly results from an 
advantage in incremental processing. This problem is worth investigating because 
young children may not be as efficient as adults at processing continuous input as 
rapidly as it is received. Their second question concerns how children use prosodic 
information, particularly in the case of a prosodic signal potentially associated with 
two different roles; that is, as a signal to syntactic structure and as one indicating 
contrastive status. Hence, to investigate this issue, they consider one instance of 
branching ambiguity in Japanese. 

Finally, in Chapter 14, “Exceptive constructions in Japanese,” Maria Polinsky,
Hisao Kurokami, and Eric Potsdam discuss the syntax and semantics of exceptives
in the Japanese language relative to the English language and provide suggestions
for future experimental research. Exceptives are constructions that express exclusion,
as in “Everybody laughed except Mary.” The authors present and analyze the
expression of exception in Japanese, formally marked with the postposition igai.
As a postposition, igai combines with a noun phrase whose internal structure
can be quite complex; it can include a nominalized CP. The Japanese
language allows for connected and free exceptives, which differ in whether the
exception and its associate form a constituent (yes for the former and no for the 
latter). Polinsky and her colleagues show that the Japanese free exceptives always 
include an underlying nominalized CP (sometimes headed by a null nominal 
head) with elided material. This kind of ellipsis differs from clausal ellipsis in the 
exceptives in languages such as English, where no nominal or determiner head is 
attested. The Japanese language adds novel data to the observation that the original 
constraint on universal quantifiers in the associate of an exceptive is excessively 
strong. 
Chapter 2
High sense of agency versus low sense of 
agency in event framing in Japanese 


1 Introduction 


A transitive event (e.g., a boy kicking a man), whether in the actual world or a 
depiction, can be apprehended in perception and interpreted via language as 
active (e.g., kick) or passive (be kicked). Interpreting a depicted event, which 
involves framing it by identifying the thematic roles of the entities and their 
animacy features and adopting an agent or patient perspective, occurs within 
the first few hundred milliseconds of viewing it (Castelhano and Henderson 2007, 
2008; Dobel et al. 2007; Hafri, Papafragou, and Trueswell 2013; Zwitserlood et al. 
2018). Perspective adoption varies with individual differences, such as the level of
empathetic or narrative engagement, and egocentric or allocentric bias (Brunyé 
et al. 2016; Hartung, Hagoort, and Willems 2017; Vukovic and Williams 2015). 
However, little attention has been paid to how the internal state of language users 
may influence language processing; hence, this chapter considers whether the 
internal factor of a sense of agency (SoA) could also be pivotal in event framing 
during language processing. 

Sense of agency can be defined as “the feeling of control over actions and their 
consequences” (Moore 2016: 1) or “the registration that I am the initiator of my
actions” (Synofzik, Vosgerau, and Voss 2013: 1). That is, SoA is induced when one's
intentions and actions match (Sidarus et al. 2017). Prior studies have attempted 
to develop implicit and explicit measures of SoA or focused on considering it the- 
oretically (for a review, see Moore 2016). Haggard, Clark, and Kalogeras (2002) 
demonstrated that when people voluntarily pressed a key and then heard a tone 
250 ms later, they perceived the time interval between action and tone as shorter 
than its actual duration; however, this was not the case when the action was involuntary
(i.e., induced using transcranial magnetic stimulation). This “intentional
binding” (IB) effect—intended actions accompanied by an SoA—is now widely 
used as an implicit measure of SoA (e.g., Demanet et al. 2013; Hascalovitz and 
Obhi 2015; van der Westhuizen et al. 2017). Subsequent studies show that active
movements induce stronger IB effects than passive or involuntary ones, indicat- 
ing that initiating intended actions stimulates or increases SoA (e.g., Borhani, Beck, 
and Haggard 2017; Engbert, Wohlschläger, and Haggard 2008; Engbert et al. 2007;
Farrer, Valentin, and Hupé 2013; Moore, Wegner, and Haggard 2009).
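
The interval-based logic of the IB measure described above can be illustrated with a minimal Python sketch. It assumes, as a simplification, that each trial yields a judged action-tone interval in milliseconds. The function name `binding_score` and the sample values are hypothetical and are not taken from any of the cited studies.

```python
from statistics import mean


def binding_score(actual_ms, judged_ms):
    """Mean temporal compression: actual interval minus judged interval.

    Positive values mean the interval was perceived as shorter than it was.
    Hypothetical helper for illustration only.
    """
    return mean(actual_ms - judged for judged in judged_ms)


# Hypothetical judgments for a 250 ms action-tone interval (not real data)
voluntary_judgments = [200, 210, 190]
involuntary_judgments = [250, 260, 240]

print(binding_score(250, voluntary_judgments))
print(binding_score(250, involuntary_judgments))
```

Under this simplified scoring, a larger positive score for voluntary than for involuntary actions would correspond to a stronger IB effect and, by assumption, a higher SoA.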

The current chapter investigates how the language-user-internal factor of SoA 
(Moore 2016) affects the early stages of language comprehension processes, includ- 
ing how action events are conceptually apprehended and subsequently interpreted 
for sentences using the active versus passive voice under an assumption that per- 
spective is reflected in voice preference. We compare the linguistic behavior of 
Japanese speakers with different SoA levels, either manipulated (Experiment 1) or 
intrinsic (Experiment 2). 

In Experiment 1, we examine whether involvement in motor activities affects 
language users’ SoA (i.e., manipulated SoA), as manifested in participants’ event
interpretation and reflected in subsequent reactions to active- versus passive-voice 
verbs used to describe the event in Japanese. We hypothesize that language com- 
prehenders' access to relational and perspective information regarding depicted 
characters is sensitive to the language comprehenders' embodied information. If 
so, their motion engagement before viewing a picture should influence their under- 
standing or interpretation of the depicted event. This hypothesis is derived from 
previous studies showing a tight link between motor activities and language com- 
prehension. For instance, Glenberg, Sato, and Cattaneo's (2008) participants spent 
approximately 20 minutes moving beans, one by one, toward or away from their 
bodies, which affected their subsequent comprehension of sentences describing 
toward or away motions. Additional support comes from studies demonstrating 
that stimulus perception influences subsequent motion (Bradley et al. 2001; Eder 
and Klauer 2007; Imbir 2017). For instance, positive stimuli facilitate the execution
of “approach” behaviors, such as pulling motions, while negative stimuli elicit
faster “avoidance” behaviors, such as pushing actions, a pattern found for positive
and negative words (Chen and Bargh 1999), pictures of spiders (Rinck and Becker 
2007), and pleasant and unpleasant pictures (Hillman, Rosengren, and Smith 2004; 
Saraiva, Schuur, and Bestmann 2013). Though these studies suggest that movement 
and linguistic processing are conducted by the same or linked systems, whether a 
single, non-repeated action can immediately affect subsequent event perception 
and linguistic processes warrants further study. 

Experiment 2 uses an IB task (Haggard, Clark, and Kalogeras 2002) to measure 
participants' intrinsic (unmanipulated) SoA and explores whether differences in 
intrinsic SoA levels affect event apprehension processes. Relative to people with 
low SoA, people with high SoA should pay more attention to agents and show an 
increased interpretation of events from the agents' perspective, facilitating their
comprehension of events described in the active (versus passive) voice. 


2 Experiment 1 


Prior research investigating the link between motion and other behavior tends to 
utilize repetitive or continuous motion rather than a single or punctuated motion 
(Glenberg, Sato, and Cattaneo 2008). This raises the question of whether the execution
of a simple motion can immediately influence language comprehenders’ SoA and 
flexibly change the subsequent process of conceiving and interpreting an event, 
and, if so, in what way. This study examines the functional role of physical activity 
that may influence SoA and the framing of a subsequently encountered transitive 
event. We employed a 3 × 2 experimental design, where participants experienced
pulling, being pulled, or remaining static, and then saw a depiction of a transitive 
event followed by a written description of the event via active or passive verb 
forms in Japanese. 


2.1 Hypotheses 


The two hypotheses are based on a general assumption that motion manipula- 
tion affects comprehenders' internal state and subsequent cognitive processes. 
The first hypothesis is an egocentric-agency account, which posits that event 
interpretations are motion-specific; that is, the type of motor activity in which lan- 
guage comprehenders are engaged affects their SoA in different ways, with poten- 
tially different effects on subsequent event perception. For example, voluntary 
or self-generated motion (i.e., pulling) in the Pull-Agent condition increases SoA, 
which conceptually highlights the agent figure, facilitating event interpretation 
from the perspective of the agent and active language processing (e.g., kicking). 
Involuntary or non-self-generated motion (i.e., being pulled) in the Pull-Patient 
condition reduces SoA, making the agent figure less prominent, hindering inter- 
pretation of the event with active voice, and facilitating passive language process- 
ing (e.g., being kicked). Without prior physical motion (i.e., the Static condition), a 
default interpretation of transitive events should emerge, which may reveal an 
underlying perspective preference, such as a preference for active language. An 
egocentric-agency account predicts that the Pull-Agent condition will elicit faster 
processing of active than passive verbs relative to the Pull-Patient and Static 
conditions, and the Pull-Patient condition will elicit faster processing of passive
than active verbs. 

The second hypothesis is a general-agency account, which claims that event 
interpretation is not motion-specific and postulates a more general facilitatory 
effect of motion on active language processing; that is, SoA will be enhanced when- 
ever intentional actions are detected (e.g., when the participant or experimenter 
is the agent of a pulling action). In this view, it is the participant's detection of 
the occurrence of an intentional action, regardless of whether the participant is 
the agent or patient of that action, that will enhance the participant's SoA, which 
increases the agent's prominence in the event, thus inducing the participant to take 
an agent perspective, facilitating an active-voice interpretation. A general-agency 
account predicts that active verbs will be processed significantly faster than passive 
ones in Pull-Agent and Pull-Patient conditions, and the effect will be greater than in 
the Static condition (when no intentional action can be detected). 


2.2 Methods 
2.2.1 Participants 


There were 30 participants (four males and 26 females). All were undergraduate 
students in Japan (average age: 21.45, SD = 0.62), native speakers of Japanese with
normal or corrected-to-normal vision, and right-handed. 


2.2.2 Picture materials 


We created 24 line drawings of transitive events in which an agent physically acts
on a patient. Attention toward characters in scenes has been shown to affect event 
interpretation patterns, and descriptions of such events allow structural alternations
between the active (tataku “slap”) and passive (tatakareru “being slapped”)
voice in Japanese (Gleitman et al. 2007). For the characters in the pictures, we 
employed three male and three female figures (i.e., boy, man, elderly man, girl,
woman, elderly woman). The characters in each drawing were always two males or 
two females, and, in half the drawings, the agent of the transitive event was on the 
right, with the patient on the left. In the other half, the agent was on the left, and the 
patient, the right. The size of the agent and patient figures was balanced (average 
39.95 cm² and 39.11 cm², respectively; this difference was not significant: t = 0.25, 
p = .802) to ensure that the image size would not affect participants’ perspective 
preference. Beyond the 24 target stimuli, we created 24 line drawings of transitive 
events as fillers with the same specifications as the target pictures. An additional 
12 pictures of transitive events were created for a practice session. 


2.2.3 Verb materials 


The experiment used 24 critical verbs, each presented in a counterbalanced 
fashion in active or passive voice, and 24 filler verbs. The critical verbs described 
the depicted events and were equally natural in the active (e.g., keru “kick”) and 
passive (e.g., kerareru “being kicked”) voice. The filler items were 12 active- and 12 
passive-voice transitive verbs that did not correctly represent the events depicted 
in the paired pictures. Two of the authors (native Japanese speakers) confirmed 
that each filler verb was semantically unrelated to the pictured event, and that all 
critical verbs in active and passive voices appropriately represented the pictured 
events with which they were paired. Of the 12 transitive verbs (six active and six 
passive) used for the practice session, half matched the depicted event. 


2.2.4 Experimental design 


The experiment had a 3 (Motion: Static/Pull-Agent/Pull-Patient) × 2 (Voice: Active/ 
Passive) repeated-measures design. Six experimental lists were created using a con- 
dition rotation determined by a Latin-square design such that each participant saw 
each target picture only once, and no participant saw the same verb more than 
once. 
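
The rotation can be sketched as follows. This is a minimal illustration in Python, not the list-construction script actually used: the function name `build_lists`, the zero-based item indexing, and the particular rotation rule (shifting the condition index by one per list) are assumptions made for the example.

```python
from itertools import product

MOTION = ["Static", "Pull-Agent", "Pull-Patient"]
VOICE = ["Active", "Passive"]
# The six cells of the 3 x 2 design.
CONDITIONS = list(product(MOTION, VOICE))


def build_lists(n_items=24):
    """Return one dict per experimental list, mapping item index -> (motion, voice).

    Hypothetical rotation rule: list k assigns item i to cell (i + k) mod 6.
    """
    n_cond = len(CONDITIONS)
    return [
        {item: CONDITIONS[(item + shift) % n_cond] for item in range(n_items)}
        for shift in range(n_cond)
    ]


lists = build_lists()
```

With 24 target pictures and six cells, each list built this way contains four items per cell, and each picture appears in every cell exactly once across the six lists.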


2.2.5 Procedure 


Participants took part in the experiment individually in a quiet room. They sat 
in front of a computer holding one end of a 15-inch stick with their right hand, 
while an experimenter, who held the other end of the stick, sat across the table. 
A partition between the participant and the experimenter prevented them from 
seeing each other. Each trial started with the screen displaying a yellow star for 
4000 ms. As soon as the star appeared on the screen, the participants placed and 
rested their right hand (still holding the stick) on a star-marked mouse pad, which 
was placed on the right side of the computer. A fixation cross with either a green 
or a gray background then appeared at the upper center of the screen for 3000 
ms. It appeared at the upper center instead of the middle center to ensure that 
the location of the participants’ gaze was not biased toward either the agent or 
the patient figure when the fixation cross was replaced by the picture. Partici- 
pants were instructed to pull the stick when the background was green and to 
do nothing but remain still when the background was gray. We used colored 
screens to elicit specific actions to avoid any plausible influence from language-based 
instruction (e.g., “please pull the stick”). When the background was 
gray, the participants were sometimes pulled by the experimenter, allowing for the 
implementation of three types of motion manipulations to create the Pull-Agent, 
Pull-Patient, and Static conditions. (This implementation links Pull-Agent [Pull-Pa- 
tient] conditions with a motion toward [away from] the participant's body; see 
Section 2.3.) 

Immediately following the offset of the green or gray screen, a transitive-event 
picture with a white background appeared in the screen's center for 1000 ms. 
Immediately after the picture disappeared, a word (verb) appeared in the center 
of the screen in red characters in the active (e.g., keru “kick”: Active-Voice condition) 
or passive (e.g., kerareru “be kicked”: Passive-Voice condition) voice. Participants 
decided whether the verb described the pictured event they had just seen 
and indicated their response as quickly as possible by pressing (on an external 
keyboard) the “1” key for “yes” and the “2” key for “no.” As participants held the 
stick with their right hand, they pressed the key with their left hand. Target pic- 
tures were always paired with a correct description in active or passive form, while 
filler pictures were always paired with an incorrect, active or passive, semantically 
unrelated description. Therefore, the expected response for all the target trials was 
“yes” (the “1” key), regardless of voice (active or passive). 

Participants received no instruction regarding the voice variation. Verb ver- 
ification responses and times between the word onset and the key press were 
recorded and analyzed as accuracy rates and response times (RT), respectively. 

The experiment began with a practice session where the participants received 
instructions and completed 12 practice trials. The main session comprised 24 critical 
items (expected to elicit “yes” responses) and 24 filler items (expected to elicit 
“no” responses), with the trial order randomized. The entire session lasted approximately 
20 minutes. All participants gave informed consent before participating 
and were compensated with a 500-yen gift card (approximately US$5 at the time 
of testing). 
2.3 Results 
2.3.1 Data analysis 


The RT data from the target trials were analyzed using linear mixed-effects 
models, with participants and items as random factors (Baayen, Davidson, and 
Bates 2008). We included Voice (Active/Passive) and Motion (Static/Pull-Agent/ 
Pull-Patient) as fixed effects with interactions between the factors. Voice condi- 
tions were deviation-coded, and Motion conditions were treatment-coded, with 
the Static condition as the reference level. We included an additional main predic- 
tor (Position: the position of the item in the sequence seen by the participant) in 
the model, without interactions with Voice and Motion (e.g., Brown, Savova, and 
Gibson 2012). We conducted backward model comparisons and included random 
slopes for (primary) fixed factors only if they improved model fit at p < .20. The 
R programming language (R Core Team 2017) and the lmer function within the 
lmerTest package (Kuznetsova, Brockhoff, and Christensen 2017) were used for 
the analyses. Before the analyses, target trials in which RT was longer than 3000 
ms (0.6% of the target data) were excluded from the data. We also excluded trials 
in which the participant's response was incorrect. The overall accuracy rate for 
the task was 98.6%. A logistic mixed-effects model analysis performed on the accuracy 
data in a similar way to the RT analysis found no significant main effects or 
significant interaction between Voice and Motion (ps > .50). 


2.3.2 Response times 


Figure 1 shows the RTs for each condition predicted by the final linear mixed-ef- 
fects regression model, and Table 1 shows the results of the statistical analysis. 
The linear mixed-effects model analysis shows that the interaction of Voice and 
Pull-Agent-Motion (p = .036) and the interaction of Voice and Pull-Patient-Motion 
(p = .026) were significant. In both cases, the interaction effect coincided with 
shorter RTs for active-voice verbs, a pattern inconsistent with an egocentric-agency 
account and consistent with a general-agency account. There was also a significant 
effect of Position, indicating that RTs tended to become shorter as the experiment 
progressed. The results do not support the prediction of a preference for active over 
passive verb forms in the Static condition. A follow-up analysis for the interactions 
revealed that simple effects of Voice were significant in the Pull-Agent condition (B = 
69.13, SE = 24.37, t = 2.84, p = .005) and Pull-Patient condition (B = 73.21, SE = 24.39, 
t = 3.00, p = .003) but not in the Static condition (B = 0.78, SE = 24.55, t = 0.03, p = .975). 
Figure 1: Predicted response times (in milliseconds) for each Motion and Voice condition. Error bars 
denote ±SEs. 


Table 1: Results of the linear mixed-effects model analysis for response times. 


                                    B      SE       t       p
Intercept                      825.08   28.74   28.71   <.001
Voice                            0.78   24.55    0.03    .975
Pull-Agent-Motion               12.00   16.25    0.74    .461
Pull-Patient-Motion              4.10   16.27    0.25    .801
Voice × Pull-Agent-Motion       68.36   32.50    2.10    .036
Voice × Pull-Patient-Motion     72.43   32.55    2.23    .026
Position                       -34.32    6.90   -4.97   <.001
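
As a reading aid, the simple effects of Voice reported in the text can be recovered from the coefficients in Table 1. Because Motion is treatment-coded with Static as the reference level, the Voice coefficient is the Voice effect in the Static condition, and adding an interaction coefficient gives the Voice effect in the corresponding pulling condition. The sketch below assumes that positive values mean slower responses to passive than to active verbs, which is inferred from the text rather than stated in the table; the function name is hypothetical.

```python
# Coefficients copied from Table 1 (in ms).
VOICE = 0.78
VOICE_X_PULL_AGENT = 68.36
VOICE_X_PULL_PATIENT = 72.43


def simple_voice_effect(voice_coef, interaction_coef=0.0):
    """Voice effect at one Motion level under treatment coding of Motion.

    For the reference level (Static) the interaction term is zero.
    """
    return voice_coef + interaction_coef


static_effect = simple_voice_effect(VOICE)
pull_agent_effect = simple_voice_effect(VOICE, VOICE_X_PULL_AGENT)
pull_patient_effect = simple_voice_effect(VOICE, VOICE_X_PULL_PATIENT)
```

The sums agree with the follow-up estimates in the text (0.78, 69.13, and 73.21) to within rounding of the tabled coefficients.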


2.4 Discussion 


This experiment explored whether an SoA generated by physical motion could 
predict language users' selection of perspective in interpreting a transitive event. 
The finding of comparable RTs for the two pulling conditions is compatible with 
the general-agency account. That is, intentional pulling actions performed by the 
participant or experimenter created measurable differences in preferences for 
agent perspective adoption. Importantly, this advantage for the agent perspective 
in event interpretation was observed regardless of motion type, indicating that it 
was not the participants' intention, but their detection of intention, that induced 
enhanced SoA. 
Contrary to the general assumption that active forms are processed faster than 
passive forms, we found no RT difference between active and passive forms in the 
Static condition. There are two plausible explanations. First, the experiment presented 
isolated verbs (with no explicit subject), which might not elicit a general perspective 
preference in interpreting transitive events. This explanation is compatible with prior 
findings that Japanese language comprehenders do not have a default perspective in 
representing a described event when no grammatical pronoun explicitly specifies the 
perspective (Sato and Bergen 2013). Second, the Static condition, which was supposed 
to show any underlying preference, may have worked instead as a restricted condi- 
tion, forcing participants to restrict their motions in a third of the trials, which could 
have reduced their SoA, facilitating a patient perspective in event interpretation and 
boosting the passive preference in language processing. 

The interpretation of the observed results is, however, complicated by the fact 
that volition and motion directionality were involved, as the two experimental con- 
ditions induced hand movements in opposite directions: participants always pulled 
the stick in the Pull-Agent condition, while the stick was always pulled by the exper- 
imenter in the Pull-Patient condition. The simplest interpretation of the results is 
that neither volition nor motion directionality affects the perspective from which 
language comprehenders perceive an event, but detecting intentional actions 
(regardless of directionality) does.¹ In this interpretation, motion itself, regardless 
of volition, will increase the likelihood that a participant will take the agent’s per- 
spective, relative to a situation without motion. With either type of motion, the 
motor information that the body experiences or detects is internally activated and 
enhances one’s SoA, inducing the framing of a subsequently encountered event 
from the perspective of the agent. If either directionality or volition is a crucial 
factor in perspective adoption when a person is perceiving an event, the Pull-Agent 
(i.e., the voluntary motion of the hand toward the body) and the Pull-Patient (i.e., the 
involuntary motion of the hand away from the body) conditions would be expected 
to produce different patterns of effects on the RTs for the target words. However, 
these two motion conditions showed the same pattern (i.e., faster RTs for active 
than passive verbs relative to the no-motion condition). The results suggest that 
neither volition nor direction of the motion plays a functional role in selecting a 
perspective, but detecting intentional actions stimulates a language comprehender 
to adopt the agent perspective. 


1 Another possible interpretation is that the effect of motion directionality also interacted with the 
effect of participants’ intentionality (i.e., sense of agency), obscuring either factor’s independent 
effects. It is also possible that motion effects on perspective adoption vary depending on the type 
of action represented (e.g., an away action such as kicking vs. a toward action such as holding). 
We used verbs for four toward actions, seven away actions, and thirteen no-direction actions (e.g., 
wiping, stepping on). To disentangle their effects, further experiments will need to control 
directionality independently of intentionality and other factors (e.g., action type). 


3 Experiment 2 


Experiment 1 tested whether and how bodily movements influence SoA, apprehension 
of transitive events involving two people (i.e., an agent and a patient of 
an action), subsequent event interpretation, and language processing of the active 
versus passive voice in Japanese. Experiment 2 investigated whether different 
levels of intrinsic SoA affect perspective adoption and subsequent linguistic behav- 
ior. It had no motion manipulation but, otherwise, replicated Experiment 1. A tran- 
sitive event was presented on a screen, followed by an active or passive verb, and 
participants judged whether they matched. 

Experiment 2 was inspired by Oren, Friedmann, and Dar (2016), who showed 
influences of obsessive-compulsive (OC) tendencies on the choice of sentence 
voice. OC disorder is “marked by intrusive and disturbing thoughts (obsessions) 
and repetitive behaviors (compulsions) that the person feels driven to perform” 
(Goodman et al. 2014: 257). The feeling that one's actions are compelled rather 
than chosen is suggestive of a reduced SoA, and individuals with OC tendencies 
have been shown to have reduced SoA in research using a common non-linguistic 
measure (IB), which we adopted for Experiment 2 (Oren, Eitam, and Dar 2019). 

While the effects of an intrinsic SoA on the process of perceiving events remain 
unexplored to the best of our knowledge, Oren, Friedmann, and Dar (2016) suggest 
the possibility of such effects. In their sentence-production experiment, partici- 
pants saw a picture of a transitive event (e.g., an elderly woman covers a girl with a 
blanket) and answered a question (e.g., “Why is the girl happy?”). In general, 
speakers attend to agents more than patients when perceiving a transitive event (agent- 
first principle or agent advantage; Cohn and Paczynski 2013; Jackendoff 1992), and 
speakers generally mention the more activated concept (the agent) earlier in the 
sentence than the less activated one (the patient), inducing the production of more 
active constructions than passive ones. Oren, Friedmann, and Dar (2016) find that 
participants with high OC tendencies were significantly more likely to produce sentences 
in which the agent of the event was omitted (e.g., “The girl is happy because 
she has a blanket” rather than “The girl is happy because the grandmother is covering 
her”) than a low OC group. The high OC group also tended to produce more 
passive constructions (e.g., “This is the boy who is being tickled” rather than “This 
is the boy that the grandfather is tickling”) than the low OC group. They concluded 
that high OC tendencies are correlated with attenuated SoA, indicating that a prefer- 
ence for agent omission and passive constructions reflects reduced SoA (cf. Duranti 
2004). That is, people with low SoA seem to have an increased tendency to omit 
agents and speak in passive constructions when describing events. Therefore, they 
may be less likely to attend to agents in event perception than people with high SoA. 

In this study, after the main experiment, we implicitly measured each participant's 
intrinsic (i.e., unmanipulated) SoA level with an IB task (Haggard, Clark, and 
Kalogeras 2002). We hypothesized that SoA level would affect language compre- 
henders' preferential perception of events. If intrinsic SoA affects the event appre- 
hension process, people with high SoA should be more likely to interpret events 
from the agent perspective than those with low SoA. 


3.1 Method 
3.1.1 Participants 


A new group of 58 students (undergraduate and graduate) in Japan participated (41 
male, 17 female; average age: 20.78, SD = 2.23). All were right-handed and native 
speakers of Japanese with normal or corrected-to-normal hearing and vision. 


3.1.2 Picture and verb materials 


The experiment used the same picture and verb materials as Experiment 1. 


3.1.3 Procedure and intentional binding task 


Each participant completed first the main experiment and then the IB task, taking 
approximately 45 minutes. All participants gave informed consent before partici- 
pation and received approximately US$10 compensation. 

The procedure of the main task was identical to that of Experiment 1, except that 
Experiment 2 employed no manipulated motion. Each trial started with a 3000 ms fix- 
ation cross at the upper center of the screen. The fixation cross was replaced by a tran- 
sitive-event picture for 1000 ms. Immediately after the picture disappeared, a Japanese 
verb (in the active or passive voice) appeared at the center of the screen in red charac- 
ters. Participants were instructed to press the “J” key with their right hand if the word 
and the picture matched or the “F” key with their left hand if they did not. Target pic- 
tures were always paired with a correct description, in either active or passive form, 
while filler pictures were always paired with an incorrect (active or passive) form. 
We measured each participant’s intrinsic level of SoA with an IB task (Haggard, 
Clark, and Kalogeras 2002), adopting van der Westhuizen et al.’s (2017) procedure. 
The task comprised four experimental blocks: Operant Action (OA), Baseline Action 
(BA), Operant Effect (OE), and Baseline Effect (BE). The participants experienced 
the blocks in one of two orders: OA > BA > OE > BE or OE > BE > OA > BA (van der 
Westhuizen et al. 2017). For each block, the participants received instructions and 
completed five practice trials and 30 experimental trials. 

Figure 2 illustrates the procedure of a trial. At the center of the screen, partic- 
ipants saw a clock face (2.8 cm diameter) on which one hand rotated at a speed of 
approximately 2500 ms per revolution, starting from a random position. In the OA, 
BA, and OE blocks, participants were instructed to press the keyboard’s spacebar at 
a time of their choosing. On pressing the key, the clock hand stopped rotating after 
a random period between 1500 and 2500 ms. In the OA block, a tone (1000 Hz, 100 
ms) sounded 250 ms after the key press; in the BA block, no tone sounded. In the 
OE block, a tone sounded 250 ms after the key press; in the BE block, participants 
were instructed not to press any key, and the tone occurred 1600-3600 ms (randomly 
determined) after the clock hand started to rotate. In the OA and BA blocks, partici- 
pants were asked to report on the position of the clock hand when they pressed the 
key by entering a number between 1 and 60 on the keyboard. In the OE and BE blocks, 
participants reported on the position of the clock hand when the tone sounded. 


[Figure 2 schematic. For each block (Operant Action, OA; Baseline Action, BA; 
Operant Effect, OE; Baseline Effect, BE), a timeline starting when the clock hand 
begins to rotate marks the actual and reported times of the key press or tone. 
Participants report where the clock hand was when they pressed the key (OA, BA) 
or when they heard the tone (OE, BE). The diagram indicates the Action Shift and 
the Tone Shift, with Total Shift = Action Shift + Tone Shift.] 


Figure 2: Schematic diagram of the intentional binding task. (Adapted from Niikuni, Nakanishi, and 
Sugiura 2022.) 
The participants’ responses and the times at which the key press or tone 
occurred were recorded, and the participants’ “judgment errors” for each trial 
were calculated. The judgment error was defined as [time participants reported 
at the end of the trial] − [actual time at which the key press/tone occurred in the 
trial]. For example, in the OE or BE block, if the tone occurred when the clock 
hand pointed at “15” and the participant's response was “11”, the error was “−4” 
(= 11 − 15). This number was converted to the actual time (in milliseconds). 
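
The conversion from clock positions to milliseconds can be illustrated with a short Python sketch. It assumes exactly 2500 ms per revolution (the text gives this only as an approximate speed) and 60 clock positions, so one position corresponds to about 41.7 ms. The wrap-around step, which treats a report of “2” against an actual position of “59” as +3 rather than −57, is also an assumption, since the text does not say how such cases were handled. The function name is hypothetical.

```python
CLOCK_POSITIONS = 60   # participants entered a number between 1 and 60
REVOLUTION_MS = 2500   # assumed exact; the text says "approximately 2500 ms"


def judgment_error_ms(reported, actual,
                      positions=CLOCK_POSITIONS, revolution_ms=REVOLUTION_MS):
    """Convert a judgment error from clock positions to milliseconds.

    Error = reported position - actual position, wrapped into the range
    [-positions/2, positions/2) so that crossings of the 60/1 boundary
    yield the smaller signed difference (an assumption of this sketch).
    """
    diff = reported - actual
    diff = (diff + positions / 2) % positions - positions / 2
    return diff * revolution_ms / positions


example_error = judgment_error_ms(reported=11, actual=15)
```

Under these assumptions, the example from the text (an error of −4 positions) corresponds to roughly −167 ms.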

For each participant, trials in which the judgment error was above or below the 
block mean by 2.5 SD (1.6% of the data) were excluded from the data. We calculated 
the Total Shift (in ms), defined as the sum of the Action Shift (mean judgment error in 
the OA block - mean judgment error in the BA block) and (-1) × Tone Shift (mean judgment 
error in the OE block - mean judgment error in the BE block), for each participant. The 
mean Total Shift for all participants was 144.2 ms (SD = 90.4 ms). Corresponding 
to typical results of IB tasks (see Haggard 2017; Moore and Obhi 2012), the mean 
Action Shift was positive (32.1 ms, SD = 36.8 ms), and the mean Tone Shift was negative 
(-112.1 ms, SD = 81.3 ms). This pattern indicates that when the intentional 
action (key press) was followed by the consequence (tone), the perceived time of 
the action was later than when the action was not followed by a consequence (i.e., 
no tone), whereas the perceived time of the consequence was earlier compared to 
when it did not follow an action (no key press). Following Haggard's (2017) claim 
that the degree of shortened time perception between action and outcome (i.e., IB 
effect) reflects an individual's SoA, we used the Total Shift as a measurement of 
individual differences in participants’ intrinsic SoA. 
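
The Total Shift computation described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the analysis script actually used: the function names are hypothetical, the trimming uses the sample standard deviation within each block, and each block is represented simply as a list of judgment errors in milliseconds.

```python
from statistics import mean, stdev


def trim(errors, n_sd=2.5):
    """Drop judgment errors more than n_sd SDs from the block mean.

    Uses the sample SD (an assumption; the text does not specify).
    """
    m, s = mean(errors), stdev(errors)
    return [e for e in errors if abs(e - m) <= n_sd * s]


def total_shift(oa, ba, oe, be):
    """Total Shift = Action Shift + (-1) * Tone Shift, in ms.

    oa, ba, oe, be: lists of judgment errors (ms) from the Operant Action,
    Baseline Action, Operant Effect, and Baseline Effect blocks.
    """
    action_shift = mean(trim(oa)) - mean(trim(ba))
    tone_shift = mean(trim(oe)) - mean(trim(be))
    return action_shift + (-1) * tone_shift


# Consistency check on the reported group means (32.1 ms and -112.1 ms).
reported_total = 32.1 + (-1) * (-112.1)
```

The check on the reported means gives 144.2 ms, matching the mean Total Shift stated in the text.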


3.2 Results 
3.2.1 Data analysis 


First, we excluded target trials in which the participants produced incorrect 
responses or the RT was longer than 1500 ms (0.8% of the target data).² The 
overall accuracy rate was 97.4%. A logistic mixed-effects model analysis found 
no significant main effects or significant interaction (ps > .25) between Voice and 
Total Shift (intrinsic SoA). We then analyzed the RT data from the target trials 
using linear mixed-effects models in the same manner as in Experiment 1, with 
Voice (Active or Passive) as a fixed effect and Total Shift as a continuous predictor. 
Voice conditions were deviation-coded, and Total Shift values were standardized 
to z-scores. 


2 Different cut-off times for data exclusion in Experiments 1 (3000 ms) and 2 (1500 ms) were 
chosen because overall RTs were slower in Experiment 1, possibly due to the task requiring hand 
movements. Another possibility for longer RTs in Experiment 1 is that participants used their left 
hand for yes and no key presses in Experiment 1 while using their right hand to indicate “match” 
responses in Experiment 2. 


3.2.2 Response times 


Table 2 shows the results of the statistical analysis. Voice (p < .001) exhibited 
a significant main effect in the linear mixed-effects model, indicating that par- 
ticipants reacted faster to the verb in the active- than passive-voice condition. 
More importantly, the analysis showed a significant interaction between Voice 
and Total Shift (p = .009). As a follow-up analysis for this interaction, we tested 
the simple effects of Voice for participants with negative versus positive z-scores 
for Total Shift. This analysis revealed that although participants with relatively 
low and high SoA reacted significantly faster to active-voice than passive-voice 
verbs, this tendency was more moderate for participants with low SoA (B = 39.07, 
SE = 11.32, t = 3.45, p = .001) than high (B = 79.29, SE = 11.51, t = 6.89, p < .001). 
Figure 3 shows the RTs for each condition predicted by the final linear mixed- 
effects regression model. 

We also tested the simple effects of Total Shift for each Voice condition. The anal- 
ysis revealed that a simple effect of Total Shift was significant in the passive-voice 
condition (B = 31.92, SE = 14.86, t = 2.15, p = .036) but not in the active-voice condition 
(B = 11.80, SE = 12.40, t = 0.95, p = .345). Participants with relatively low SoA 
reacted faster to passive-voice verbs than those with high SoA, but this RT differ- 
ence did not appear for active-voice verbs. 


Table 2: Results of the linear mixed-effects model analysis for response times. 


                            B      SE       t       p
Intercept              600.94   14.44   41.63   <.001
Voice                   59.18    8.30    7.13   <.001
Total Shift             21.86   13.17    1.66    .102
Voice × Total Shift     20.12    7.45    2.70    .009
Position                -7.23    3.21   -2.25    .025
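
Because Total Shift enters the model as a standardized continuous predictor, the Voice effect implied by Table 2 varies linearly with a participant's z-score. The sketch below evaluates that effect one standard deviation below and above the mean, the levels plotted in Figure 3. Whether the reported follow-up estimates for low- and high-SoA participants were obtained in exactly this way is an assumption; the function name is hypothetical.

```python
# Coefficients copied from Table 2 (in ms).
VOICE_COEF = 59.18
VOICE_X_TOTAL_SHIFT = 20.12


def voice_effect_at(z_total_shift):
    """Model-implied Voice effect (ms) at a given standardized Total Shift."""
    return VOICE_COEF + VOICE_X_TOTAL_SHIFT * z_total_shift


low_soa_effect = voice_effect_at(-1.0)   # one SD below the mean Total Shift
high_soa_effect = voice_effect_at(+1.0)  # one SD above the mean Total Shift
```

These values fall within rounding distance of the simple effects reported in the text (39.07 for low SoA and 79.29 for high SoA).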
Figure 3: Predicted response times (in milliseconds) for Voice conditions 
by Total Shift (-SD/+SD). Error bars denote ±SEs. 


3.3 Discussion 


We asked whether intrinsic SoA as measured by the IB task would elicit different 
types of perspective adoption (agent vs. patient) in interpreting a transitive event. 
We hypothesized that if intrinsic SoA affects the event apprehension process, 
people with high SoA would be more likely to take the agent perspective than those 
with low SoA. Consistent with the SoA effect on event apprehension, the Voice x 
Total Shift interaction was significant. Participants with low SoA showed a smaller 
difference in RTs between the two voice conditions than those with high SoA, who 
responded faster to active-voice than passive-voice verbs. 

Experiment 2 also revealed a significant effect of Voice, with shorter RTs for 
active-voice verbs. This effect may have stemmed from a strong preference for 
the agent perspective in transitive-event apprehension (Griffin and Bock 2000; 
Meyer, Mack, and Thompson 2012) or for unmarked active-voice verbs (relative to 
inflected passive ones) in word recognition (Yokoyama et al. 2006), leading to active 
verbs being processed more quickly than passive ones, even in our experimental 
paradigm. 

Furthermore, the results indicated that intrinsic SoA positively predicts RTs for 
passive-voice verbs but not for active-voice verbs. If our interpretation is correct, 
SoA would negatively predict RTs for active-voice verbs and positively predict RTs 
for passive ones. However, without a baseline condition predicted to be equivalent 
in individuals regardless of SoA, it is challenging to ascertain exactly what accounts 
for the significant interaction, warranting further investigation. 
4 General discussion 


Recent studies investigate the role of individual differences in event interpretation 
(Brunyé et al. 2016; Hartung, Hagoort, and Willems 2017; Vukovic and Williams 
2015), and comprehenders' internal status (i.e., SoA) and emotions are known to 
influence the perception and memory of events (Havas, Glenberg, and Rinck 2007). 
However, this study is evidently the first to explore the role of a non-linguistic, inter- 
nal factor, SoA, in event framing and language comprehension. We considered two 
questions: (i) whether simple physical motions fluidly change the SoA and affect a 
subsequent process of event interpretation (Experiment 1), and (ii) how individual 
differences in the intrinsic SoA influence perspective adoption in interpreting a 
transitive event (Experiment 2). 

The results from the two experiments, one with manipulated SoA, and the 
other, intrinsic SoA, have implications regarding the role of motion in SoA and 
event interpretation. Experiment 1 supports the hypothesis that motor informa- 
tion (self- or other-generated) that the body experiences is internally detected and 
enhances the SoA, increasing the prominence of agent figures in events and leading 
participants to respond to the provided active verbs more quickly, thus reflecting 
the participants' framing of events from the agent's perspective. The Experiment 
1 results accord with the results from the participants with high SoA in Experi- 
ment 2, supporting the assumption that the motor manipulations in Experiment 
1 enhanced participants' SoA. Moreover, the results from the Static condition in 
Experiment 1 and the relatively low SoA participants in Experiment 2 suggest that 
SoA might be significantly lower when motion is hindered than when motion is not 
manipulated. 

This study's participants were native Japanese speakers. How motion influ- 
ences SoA and subsequent processes of event interpretation may vary across lan- 
guages. Systematic and consistent ways of describing the world linguistically may 
play a causal role in cognition, influencing how people apprehend events (Trueswell 
and Papafragou 2010), pay attention to event participants (e.g., agents and patients; 
Fausey and Boroditsky 2006; Fausey et al. 2010), remember events (Gentner and 
Loftus 1979; Loftus and Palmer 1974), and favor explanations of causal attribution 
(Choi and Nisbett 1998; Choi, Nisbett, and Norenzayan 1999). Such connections 
would allow for a feedback loop, in which people interpret events in a certain way, 
influencing descriptions, which then influence subsequent apprehension. English 
and Japanese, for instance, are useful for comparative purposes as they differ in the 
frequency of agentive versus non-agentive and transitive versus intransitive expres- 
sions. As Choi (2009) observed, Japanese speakers use non-agentive expressions 
more frequently than English speakers when an event (e.g., dropping keys) equally 
allows for agentive (e.g., Kagi-wo otoshita “[I] dropped [the] keys”) and non-agentive 
(e.g., Kagi-ga ochita “[The] keys dropped”) descriptions. Fausey et al. (2010) found 
that although English and Japanese speakers use equally agentive language to 
describe videos of intentional events (e.g., a person takes an egg from a carton and 
cracks it against a bowl), Japanese speakers are more likely than English speakers to 
use non-agentive language when describing accidental events (e.g., a person takes 
an egg from a carton and drops it so that it breaks). The different linguistic patterns 
in the two languages manifested beyond language use in the study: both language 
groups remembered the agents of intentional events equally well, but the English 
group remembered the agents of accidental events better. This pattern reflects an 
attenuation of Japanese speakers' attention to the agent in conjunction with their 
non-agentive linguistic descriptions of the accidental events. The findings indicate 
that cross-linguistic differences in habitual usage implant different cognitive pat- 
terns, which can lead speakers of different languages to focus on different aspects 
of events. Taking a particular perspective when interpreting an event is not a 
simple, fixed process; it is shaped by the language people use. 

Given that a general cognitive preference can vary across languages depending 
on various linguistic and extra-linguistic factors, it might be fruitful to examine 
whether and how languages that exhibit a preference for the agent or patient per- 
spective induce general cognitive differences in terms of SoA. For instance, lan- 
guages that preferentially use the agent perspective might tune speakers to be 
self-focused, enhancing SoA in general. Moreover, individual levels of SoA may play 
a role in speakers’ selection of syntactic structures. Within a language, people with 
relatively high SoA may tend to drop or demote patient arguments and, thus, be 
more likely to use anti-passives than those with relatively low SoA. However, people 
with relatively low SoA may tend to include or promote patient arguments. 

By examining languages that vary in linguistic properties, we may eventually 
build a comprehensive model of physical motion and event-encoding processes to 
better explain the roles of agentivity and embodiment information in our cognition. 


5 Conclusion 


This chapter compared the linguistic behavior of language comprehenders with 
different SoA levels, manipulated or intrinsic, when framing or interpreting a 
transitive event. We predicted that motion would affect the language-user-inter- 
nal factor of SoA, which could change subsequent cognitive processes, such as 
apprehending pictured events and internally representing linguistically described 
events. The findings showed an agent-perspective advantage in event interpretation
when participants experienced intentional actions, whether the actor was the
participant or an experimenter, but no such agent-perspective advantage when
they remained motionless. Therefore, arguably, the detection of intentional action, 
not volitional action, enhances SoA, increasing agent prominence in transitive 
events, thus inducing an agent-perspective preference in interpreting such events. 

The study also demonstrated the effects of individual differences in intrinsic 
SoA, which drove the tendency to frame an event in a manner compatible with 
the active or passive voice; the degree of intrinsic SoA tends to correlate with a 
preference for active or passive language: people with relatively high SoA are more 
likely to interpret events from the perspective of an agent and show greater differ- 
ences in responses to language in the active versus passive voice than those with 
relatively low SoA.


Kentaro Nakatani

Chapter 3 

Locality-based retrieval effects are 
dependent on dependency type: 

A case study of a negative polarity 
dependency in Japanese 


1 Introduction 


The human working memory capacity is limited (Daneman and Carpenter 1980; Just
and Carpenter 1992; Osaka and Osaka 1992; Daneman and Merikle 1996; Engle et al. 
1999; Conway et al. 2005). For example, the mean reading span score reported by 
Daneman and Carpenter (1980) (Experiment 1) was 3.15, meaning that, on average, 
participants were successful in storing and recalling slightly more than three 
words when a reading task interfered. Moreover, center-embedded structures such 
as (1a) are more challenging to understand than their right- or left-branching coun- 
terparts, such as (1b) (Yngve 1960; Chomsky and Miller 1963); furthermore, double 
center-embedding easily yields incomprehensible sentences, as illustrated in (2): 


(1) a. The reporter who the senator attacked ignored the president.
    b. The senator attacked the reporter who ignored the president.


(2) The reporter who the senator who the congressman criticized attacked
    ignored the president.


Intuitively, it is obvious that the challenge in the processing of a doubly center- 
embedded structure stems from the difficulty in keeping track of who does what in 
the event depicted in the sentence. It is, thus, natural to assume that when multiple 
grammatical relations must be simultaneously tracked in incremental processing, 
the memory load will be greater. Such a structural situation is likely to arise
when two words that stand in a grammatical relationship are separated,
with other words intervening between them. The effect incurred by having a 
long-distance dependency is called a locality effect (Gibson 1998, 2000; Van Dyke 
and Lewis 2003; etc.). Locality effects are assumed to stem from an increase in the
memory load, either because separating two words to be integrated adds to the number of
incomplete dependencies that must be stored in memory (Gibson 1998), because it
makes it more challenging to retrieve the antecedent at the tail of the dependency
chain (Gibson 1998, 2000; Van Dyke and Lewis 2003), or both.

Despite this apparently straightforward logic, the evidence for locality effects has 
been relatively weak (Bartek et al. 2011; Levy and Keller 2013). Some have reported 
locality effects, while others have reported anti-locality effects (i.e., speedup effects 
for having long-distance dependencies). As Table 1 shows, among 17 previous studies
on the effects of long-distance dependencies, approximately half reported locality 
effects, while the rest reported anti-locality effects or null results. Furthermore, 
anti-locality effects have been found mostly in subject-object-verb (SOV) languages 
such as Hindi, German, and Japanese. Why is this the case? 


Table 1: Summary of previous studies on locality effects (studies with an asterisk did not control for
position effects).


Study                                     Language         Dependency type  Main findings

Safavi, Husain, and Vasishth (2016)*      Persian          Thematic         Locality effects
Bartek et al. (2011)*                     English          Thematic / RC    Locality effects
Grodner and Gibson (2005)*                English          Thematic / RC    Locality effects
Levy, Fedorenko, and Gibson (2013)*       Russian          RC               Locality effects
van Dyke and Lewis (2003)*                English          Reanalysis       Locality effects
Vasishth and Drenhaus (2011)              German           RC               Locality effects
Ono and Nakatani (2015)                   Japanese         Wh-question      Locality effects
Nakatani (2021)                           Japanese         Adverbial NPI    Locality effects
Phillips, Kazanina, and Abada (2005)      English          Wh-question      Lower ratings / Delayed P600
Nicenboim et al. (2016)                   German, Spanish  RC               Locality effects (high-capacity readers)
                                                                            Anti-locality effects (low-capacity ones)
Vasishth and Lewis (2006)*                Hindi            Thematic / RC    Anti-locality effects
Husain, Vasishth, and Srinivasan (2014)*  Hindi            RC / Thematic    Anti-locality effects
Konieczny (2000)*                         German           Thematic         Anti-locality effects
Konieczny and Döring (2003)               German           Thematic         Anti-locality effects
Levy and Keller (2013)                    German           Thematic         Anti-locality effects
                                                           RC               Locality effects (with an adjunct)
Nakatani and Gibson (2008)*               Japanese         Thematic         Null results (trend toward speedup)
Nakatani and Gibson (2010)                Japanese         Thematic         Null results (slowdown at subject NPs)


(RC: relative clause, NPI: negative polarity item)


One non-trivial factor that may be partially responsible for the mixed results is 
the lack of control for potential position effects in some of the studies. It has been 
suggested that placing the critical word in different positions across the conditions 
would yield a so-called position effect because, generally, people tend to speed up 
as they proceed through a sentence (Ferreira and Henderson 1993). Thus, simply
lengthening a dependency by inserting more words, and thereby pushing the
critical word to a later position, is likely to speed up reading at that word
because of its later position, independent of
dependency length. This confound is often found in prior studies on locality effects
(as pointed out by Nakatani and Gibson 2010 and Levy and Keller 2013; see also 
Table 1). Accordingly, this study's experimental designs employed scrambling oper- 
ations in Japanese to control for this potential confound. 

Another factor worth testing is the effect of dependency type. In SOV languages, 
the distance between a verb and its arguments, especially the subject, can easily be 
made greater because the subject is placed sentence-initially in canonical order, the 
verb is placed sentence-finally, and everything else comes in between. The situation 
surrounding the dependency length manipulation is different in subject-verb-ob- 
ject (SVO) languages because the verb is placed in the middle of the sentence, and 
adjuncts are usually placed in the right periphery, a non-interfering position in 
argument-predicate dependencies. This property seems to have induced the studies 
of locality effects in SVO languages to resort to the inclusion of extra grammatical 
dependencies such as a wh-gap relationship, where wh can be easily placed farther 
away from its original gap position. Indeed, in SOV languages, dependency length 
can be manipulated by varying a thematic dependency, whereas in SVO languages, 
the manipulation of dependency length often requires the inclusion of an extra 
dependency added to a thematic dependency. 

This contrast between SOV and SVO languages may have led to the contrast 
between the results of the studies of locality effects in SOV and SVO languages. This
chapter hypothesizes that argument-predicate (and adjunct-predicate) dependencies
(henceforth, thematic dependencies) are less prone to memory decay while
other extra grammatical dependencies such as wh-gap dependencies may be more 
likely to decay. One possible memory-based explanation for the purported con- 
trast between thematic dependencies and other grammatical dependencies may 
be provided by activation models, such as the CC READER model (Thibadeau, Just, 
and Carpenter 1982; King and Just 1991) and Vasishth and Lewis's (2006) activation 
decay model based on the ACT-R architecture (Anderson et al. 2004 and the refer- 
ences cited therein). According to these theories, information in working memory 
is assigned some activation level, and it decays from working memory as its activa- 
tion level decreases. For example, Vasishth and Lewis (2006) hypothesize that the 
activation level is a function of recency and the number of reactivations triggered 
by the members of the dependency. The cost of integration at the tail of a depend- 
ency is inversely proportional to the activation level of the dependency. 
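
As a rough illustration of how recency and reactivation could jointly determine an activation level, the sketch below uses the standard ACT-R base-level learning equation, B = ln(sum of t_j^(-d)), where t_j is the time elapsed since the j-th (re)activation and d is a decay rate. The function name, the default d = 0.5, and the time stamps are illustrative assumptions; they are not values or code taken from Vasishth and Lewis (2006).

```python
import math


def base_level_activation(activation_times, now, decay=0.5):
    """ACT-R-style base-level activation.

    activation_times: moments (in seconds) at which the item was created or
    reactivated; each must be earlier than `now`.
    decay: decay rate d; 0.5 is a conventional default, assumed here.
    Returns ln(sum((now - t) ** -decay)) over all activation times.
    """
    elapsed = [now - t for t in activation_times]
    if not elapsed or any(e <= 0 for e in elapsed):
        raise ValueError("every activation must precede the retrieval time")
    return math.log(sum(e ** -decay for e in elapsed))


# Hypothetical timeline: a chain opened at 0.0 s and retrieved at 2.0 s.
unsupported = base_level_activation([0.0], now=2.0)               # never reactivated
supported = base_level_activation([0.0, 0.5, 1.0, 1.5], now=2.0)  # reactivated three times
later = base_level_activation([0.0], now=4.0)                     # same chain, longer delay
```

Under these assumptions, the reactivated chain ends up with a higher activation than the unsupported one, and the unsupported chain loses activation as the delay grows, which is the qualitative pattern the activation-based accounts rely on.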

We tentatively adopt Gibson's (2000) dependency integration theory, where
dependencies are defined as relations between heads (rather than between a head
and a phrase). We assume a head h₁ may trigger an initial expectation for another
head w, in which case an incomplete dependency chain <h₁, w> is set up in working
memory; h₁ is the head of the dependency chain, and w is its tail. If another head h₂
is encountered and is also expected to be integrated with w, then the dependency
chain <h₁, w> is reactivated and h₂ joins the chain, updating it as <h₁, h₂, w>. If a
head that is encountered is then qualified for fulfilling w, the dependency relations
are fully integrated and discharged from working memory. This process is found
in a simple case of thematic integrations in an SOV language, illustrated below, where
the dependency chain dc₁, triggered by the subject John-ga with a predicted V head
w, is stored in working memory and is incrementally joined by (integrated with)
the other noun phrases (NPs), reactivated each time, and fully integrated when the
verb is encountered:


(3) dc₁  <j, w>    <j, m, w>  <j, m, b, w>  <j, m, b, introduced>
         John ga   Mary o     Bill ni       syookaisita
         John NOM  Mary ACC   Bill DAT      introduced


Presumably, if the ongoing processing of an incomplete dependency chain maintains
its activation level in working memory, the members of this dependency chain can 
be accessed and recalled quickly. 
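
The bookkeeping in (3) can be sketched as a small data structure. The class and method names below are hypothetical; the sketch only mirrors the chain updates described above (a head opens a chain with an expected tail w, each later head joins and reactivates it, and the chain is discharged when the expected head arrives).

```python
class DependencyChain:
    """Toy model of an incomplete dependency chain <h1, ..., w>."""

    def __init__(self, first_head, expected="w"):
        self.heads = [first_head]   # heads integrated so far
        self.tail = expected        # placeholder for the predicted head
        self.reactivations = 0      # how often a later head re-accessed the chain
        self.complete = False

    def join(self, head):
        """A new head expecting the same tail joins and reactivates the chain."""
        if self.complete:
            raise RuntimeError("chain already discharged")
        self.heads.append(head)
        self.reactivations += 1

    def discharge(self, actual_tail):
        """The predicted head is encountered: integrate and close the chain."""
        self.tail = actual_tail
        self.complete = True

    def snapshot(self):
        """Render the chain in the angle-bracket notation used in (3)."""
        return "<" + ", ".join(self.heads + [self.tail]) + ">"


# Replaying (3): John-ga Mary-o Bill-ni syookaisita
chain = DependencyChain("j")
states = [chain.snapshot()]
for np_head in ["m", "b"]:
    chain.join(np_head)
    states.append(chain.snapshot())
chain.discharge("introduced")
states.append(chain.snapshot())
```

In this toy version, every intervening NP touches the thematic chain; a chain that no intervening word joins would sit in memory with a reactivation count of zero until its tail arrives.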

From this perspective, the wh-gap dependency offers a different picture. Consider
a case of object-extracted relative clauses in English, such as in (4) below.
Here, focusing on the dependencies within the relative clause, two dependency
chains are involved: the filler-gap dependency chain triggered by who (dc₁) and the
thematic dependency triggered by John (dc₂). The thematic dependency chain dc₂
remains activated until the final participant of the chain, t, is set up because all the heads
in between integrate into the same dependency chain. However, the filler-gap dc₁
is stored in working memory without “maintenance support” (King and Just 1991:
598) from intervening words, John and criticized, because they are independent of
the A-bar chain formed by wh and t.


(4) dc₁          <wh, w>  →       →                   <wh, t>
    dc₂                   <j, w>  <j, criticized, w>  <j, criticized, t>
    (the nurse)  who      John    criticized          t


Thus, unlike thematic dependencies, filler-gap dependencies are more likely (if not
certain) to be subject to memory decay when the tail of the chain is far
from the head. This chapter is primarily concerned with this issue.


2 Negative polarity item dependency with 
sika in Japanese 


This study utilizes a novel type of dependency between a negative-sensitive excep- 
tive marker sika in Japanese and its obligatory licenser (i.e., verbal negative mor- 
pheme Neg), such as in (5) below, to test the hypothesis that locality effects are a 
function of dependency length and sensitive to the dependency type. 


(5)  tentyoo sika sore o { *sinzi-ta / sinzi-nakat-ta } 
store-manager SIKA it ACC {*believe-PAST /believe-NEG-PAST } 
“Nobody but the store manager believed it.” 


This sika-marked element works like English negative polarity items (NPIs) such as 
any, but, unlike English NPIs, it can appear in the subject position of the negated 
predicate. Furthermore, unlike in English, where Neg precedes an NPI, the licenser 
Neg in Japanese always follows the sika-marked NPI in linear order because Neg 
in Japanese is a verbal suffix, and Japanese is strictly verb-final. This property 
makes the NPI-Neg dependency comparable with filler-gap dependencies in that an
encounter with the NPI immediately opens a new incomplete dependency, triggering
an expectation for Neg.¹

Several different predictions can be made regarding the processing of a verb 
whose subject is marked with sika relative to the processing of the same verb 
preceded by a regular nominative-marked subject. First, because sika requires Neg, 
it may strengthen the expectation for the negated verb to come, speeding up the 
processing when the verb is encountered. However, from a retrieval perspective, 
adding an extra grammatical dependency may increase the cost of retrieving the 
sika-marked subject. In (5), the thematic relations between the verb root and the 
nominative and accusative arguments establish an affirmative proposition (the 
manager believed it), whereas the NPI relation between sika and Neg triggers an 
additional inference on exclusivity such that the proposition exclusively applies to 
the manager. These two dependencies should be of distinct types, and, thus, the acti- 
vation level of the ongoing processing of the sika-Neg dependency should decrease 
in proportion to the distance between sika and Neg, under the assumption that 
the processing of the NPI-Neg dependency would not receive maintenance support 
from the intervening elements (King and Just 1991; Vasishth and Lewis 2006). This 
hypothesis predicts some locality effects at the negated verb when the subject is 
distant and sika-marked, relative to the cases where sika is not involved or the cases 
where the subject is local to the verb region. Hence, we conducted two self-paced 
reading experiments to test these predictions, controlling for the position factor. 


3 Experiment 1 


The main goal of Experiment 1 was to test the hypothesis that locality effects are 
a function of dependency length and dependency type, comparing the processing 
of a verb whose subject was sika-marked and that of a verb whose subject was not 
sika-marked. 


1 Researchers such as Miyagawa, Nishioka, and Zeijlstra (2016) note that NP-sika is unlike NPIs 
in English in that the former obligatorily requires the presence of the Neg head and cannot be 
licensed semantically while the latter can be semantically licensed under a non-negative down- 
ward-entailing environment (Ladusaw 1979; Von Fintel 1999). They argue that NP-sika is better 
regarded as a negative concord item. Note that NP-sika is also different from negative concord 
expressions in English and other languages, such as I don’t have no money in that Japanese sika 
obligatorily requires checking by Neg. Given that this study does not address the question of whether
the sika-marked element is a negative concord or polarity item, we tentatively stick to a more
traditional term (negative polarity item) when referring to it.


3.1 Methods and materials
Participants 


Participants comprised 51 native speakers of Japanese, mostly undergraduate stu- 
dents at a university in Japan. They received 800-yen compensation for their approx- 
imately 30-minute participation. 


Design and materials 


We prepared materials using a 2 x 4 factorial design, varying the Locality factor
and the Dependency Type factor. Regarding the Locality factor, the distance
was varied by scrambling. The Dependency Type factor was varied by different
subject markers: NP ga “NP NOM,” NP dake ga “NP only NOM,” or NP sika. Neither
case marker ga nor exclusivity marker dake “only” requires a negative context.
Thus, only the sika-marked subjects were obligatorily negative-sensitive.² Note
that, semantically speaking, sika is similar to dake in that both denote exclusivity.
The non-negative-sensitive dake conditions were added to ascertain whether it
was semantic exclusivity or the expectation for an obligatory licensing dependency
that would induce locality effects. All the target sentences were further embedded
as adjunct clauses (using either node or tame, both of which are suffixal conjunctions
heading “because” clauses) to avoid potential wrap-up effects. A sample set
of materials is shown below, where regions for presentation are separated by slashes.
The Locality factor did not alter the interpretations of the sentences; thus, English
translations for the Local conditions are not given. The crucial dependencies in
this experimental design are shown in boldface. The matrix clause in which the
target clause was embedded is shown in (6a) but omitted in the other conditions
for brevity.


2 When an NP is marked with sika, nominative and accusative case markers are obligatorily 
deleted, making the sika conditions slightly more ambiguous than others, but we assumed that 
sentence-initial sika phrases would likely be interpreted as subjects because of the canonical SOV 
order, especially when they referred to humans. 




(6) a. Nom x Distant
       tentyoo ga  / ueetoresu ga / zyoorenkyaku o
       manager NOM / waitress NOM / regular.customer ACC
       / nagut-ta to   / sinzi-nakat-ta node      / hukutentyoo wa
       / hit-PAST COMP / believe-NEG-PAST because / assistant.manager TOP
       / doo / de-tara / yoi noka / kangaeagune-ta.
       / how / deal.if / good Q   / wonder-PAST
       “Because the manager did not believe that the waitress hit the regular
       customer, the assistant manager wondered how to properly deal with it.”

    b. Only x Distant
       tentyoo dake ga  / ueetoresu ga / zyoorenkyaku o
       manager only NOM / waitress NOM / regular.customer ACC
       / nagut-ta to   / sinzi-nakat-ta node      / ...
       / hit-PAST COMP / believe-NEG-PAST because / ...
       “Because only the manager did not believe that the waitress hit the
       regular customer, ...”

    c. sika x Distant
       tentyoo sika / ueetoresu ga / zyoorenkyaku o
       manager SIKA / waitress NOM / regular.customer ACC
       / nagut-ta to   / sinzi-nakat-ta node      / ...
       / hit-PAST COMP / believe-NEG-PAST because / ...
       “Because nobody but the manager believed that the waitress hit the
       regular customer, ...”

    d. Nom x Local
       ueetoresu ga / zyoorenkyaku o       / nagut-ta to
       waitress NOM / regular.customer ACC / hit-PAST COMP
       / tentyoo ga  / sinzi-nakat-ta node      / ...
       / manager NOM / believe-NEG-PAST because / ...

    e. Only x Local
       ueetoresu ga / zyoorenkyaku o       / nagut-ta to
       waitress NOM / regular.customer ACC / hit-PAST COMP
       / tentyoo dake ga  / sinzi-nakat-ta node      / ...
       / manager only NOM / believe-NEG-PAST because / ...

    f. sika x Local
       ueetoresu ga / zyoorenkyaku o       / nagut-ta to
       waitress NOM / regular.customer ACC / hit-PAST COMP
       / tentyoo sika / sinzi-nakat-ta node      / ...
       / manager SIKA / believe-NEG-PAST because / ...

Note that even though the meaning of sika is comparable to that of dake in exclusivity,
it works in the opposite direction regarding truth conditions because NP
sika, when properly licensed, creates an affirmative context for the sika-marked
element (i.e., “nobody but X” is affirmative regarding X). Thus, (6c) means “only
the manager believed,” whereas (6b) means “only the manager did not believe.”
Although this study is concerned with the effects of negative polarity dependency 
and locality, not the effects of negation itself (cf. Yoshida 2002), affirmative versions 
of the dake conditions were also included as another type of grammatical relation 
for comparison. 


(6) g. OnlyAff x Distant
       tentyoo dake ga  / ueetoresu ga / zyoorenkyaku o
       manager only NOM / waitress NOM / regular.customer ACC
       / nagut-ta to   / sinzi-ta node        / ...
       / hit-PAST COMP / believe-PAST because / ...
       “Because only the manager believed that the waitress hit the regular
       customer, ...”

    h. OnlyAff x Local
       ueetoresu ga / zyoorenkyaku o       / nagut-ta to
       waitress NOM / regular.customer ACC / hit-PAST COMP
       / tentyoo dake ga  / sinzi-ta node        / ...
       / manager only NOM / believe-PAST because / ...


The truth-conditional semantics of (6g, h) are comparable to those of (6c, f). These 
conditions were included to tease apart the effects of the Dependency Type factor 
and the truth-conditional semantics. In an ideal world, we could have prepared the
affirmative conditions for the Only and other conditions, adopting a 2 x 3 x 2 factorial
design ({Distant/Local} x {Nom/Only/sika} x {Neg/Aff}), though doing so would
raise the number of conditions to 12, which is practically challenging to implement.
Moreover, the combination of sika x Aff is ungrammatical in the first place. Thus,
we included the affirmative versions of the Only conditions only, treating them as
another level in the Dependency Type factor, labeled OnlyAff. Note that the critical
verb region in the OnlyAff conditions lacked a negative morpheme, making this
region shorter and less complex than the same region in the other Dependency Types,
all of which involved Neg. Therefore, the interaction with the Locality factor would
be the only target issue regarding OnlyAff. Thirty-two sets of items as exemplified in (6a-h)
were constructed and distributed into eight lists, using a Latin Square design, and
96 filler items were added to each list, among which 54 items were from three unrelated
experiments, and 42 were pure fillers unrelated to any of the sub-experiments.
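
The distribution of 32 item sets over eight lists can be sketched as a simple rotation. The chapter states only that a Latin Square design was used; the rotation scheme, the function name, and the condition labels below are assumptions made for illustration.

```python
def latin_square_lists(n_items, conditions):
    """Build one presentation list per condition.

    List `offset` assigns item i to conditions[(i + offset) % k], so every
    item occurs once per list and in every condition exactly once across
    the k lists.
    """
    k = len(conditions)
    if n_items % k != 0:
        raise ValueError("items must divide evenly among conditions")
    return [
        [(item, conditions[(item + offset) % k]) for item in range(n_items)]
        for offset in range(k)
    ]


# Condition labels follow the example numbering (6a-h).
CONDITIONS = ["6a", "6b", "6c", "6d", "6e", "6f", "6g", "6h"]
lists = latin_square_lists(32, CONDITIONS)
```

With 32 items and eight conditions, each list in this sketch contains four items per condition, and no participant assigned to a single list sees the same item in more than one condition.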
Procedure


The experiment was conducted with Linger 2.94 (https://tedlab.mit.edu/~dr/Linger/), 
a sentence-processing experimental presentation program written by Douglas 
Rohde, using Apple Mac mini computers on Mac OS X and 17-inch LCD monitors. The 
program presented one sentence at a time on the monitor, left to right and region 
by region in a noncumulative, moving-window manner as a participant pressed the 
space bar (Just, Carpenter, and Woolley 1982). Each region roughly corresponded 
to a unit containing one free morpheme plus suffixal-bound morphemes (e.g., case 
markers, postpositions, and conjunctions). The program presented the materials of 
one list in a different pseudo-random order for each participant such that no two 
target items were presented consecutively. The participants were asked to read the 
sentences as naturally as possible. The experiment was preceded by brief instruc- 
tions and 10 practice items. Each stimulus sentence was immediately followed by 
a yes-no question regarding the content of the sentence, with visual feedback for 
wrong answers. 


3.2 Results 
Comprehension accuracy 


The mean accuracy rate of all items (including fillers and excluding practice items) 
was 81.2%, and the mean accuracy rate of the target items for this experiment was 
79.2%. The breakdown of the mean accuracy rate by conditions was as follows: 
(6a) Nom x Distant 79.4%; (6b) Only x Distant 76.5%; (6c) sika x Distant 71.1%; (6d) 
Nom x Local 82.8%; (6e) Only x Local 82.8%; (6f) sika x Local 77.0%; (6g) OnlyAff x 
Distant 78.9%; and (6h) OnlyAff x Local 85.3%. Numerically, the mean accuracy rate
of the sika conditions was lower than that of the others (74.0% vs. 81.0%); that of the
Distant conditions was lower than that of the Local conditions (76.5% vs. 82.0%); that of the
Only conditions was slightly higher than that of the others (80.9% vs. 77.6%); and
that of the OnlyAff conditions was higher than that of the others (82.1% vs. 78.3%).
The fitted logistic regression model revealed neither main effects of any of the
factors nor significant interactions between them (all ps > .1).
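
The marginal contrasts above can be checked against the eight per-condition rates. The sketch below treats the cells as equally weighted, which is an assumption (cell sizes are not reported); the helper name is hypothetical. Under that assumption, the reported Only contrast (80.9% vs. 77.6%) is reproduced when the Only and OnlyAff cells are pooled, which suggests that "Only" in that comparison may refer to all dake conditions.

```python
from statistics import mean

# Per-condition accuracy rates (%) as reported for (6a-h).
ACC = {
    ("Nom", "Distant"): 79.4, ("Only", "Distant"): 76.5,
    ("sika", "Distant"): 71.1, ("OnlyAff", "Distant"): 78.9,
    ("Nom", "Local"): 82.8, ("Only", "Local"): 82.8,
    ("sika", "Local"): 77.0, ("OnlyAff", "Local"): 85.3,
}


def contrast(selector):
    """Unweighted mean accuracy of the selected cells vs. all remaining cells."""
    inside = [v for k, v in ACC.items() if selector(k)]
    outside = [v for k, v in ACC.items() if not selector(k)]
    return mean(inside), mean(outside)


sika_vs_rest = contrast(lambda k: k[0] == "sika")
distant_vs_local = contrast(lambda k: k[1] == "Distant")
onlyaff_vs_rest = contrast(lambda k: k[0] == "OnlyAff")
dake_vs_rest = contrast(lambda k: k[0] in ("Only", "OnlyAff"))  # Only and OnlyAff pooled
```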


Reading times 


Data points beyond three standard deviations (SD) from the relevant condition x
region cell mean were discarded to reduce the influence of outliers arising from noise factors
such as lack of attention and sleepiness. Trials with wrongly answered comprehension
questions were excluded from the initial analyses.
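
The trimming rule can be sketched as follows. The data layout (a list of dicts with "condition", "region", and "rt" keys) and the function name are assumptions for illustration; the actual analysis was run in R, and its code is not given in the chapter.

```python
from collections import defaultdict
from statistics import mean, stdev


def trim_outliers(trials, n_sd=3.0):
    """Drop trials whose RT lies more than n_sd sample SDs from the mean of
    their own (condition, region) cell.

    Cells with fewer than two trials are kept untouched because no SD is
    defined for them.
    """
    cells = defaultdict(list)
    for trial in trials:
        cells[(trial["condition"], trial["region"])].append(trial)

    kept = []
    for cell_trials in cells.values():
        rts = [t["rt"] for t in cell_trials]
        if len(rts) < 2:
            kept.extend(cell_trials)
            continue
        m, sd = mean(rts), stdev(rts)
        kept.extend(t for t in cell_trials if abs(t["rt"] - m) <= n_sd * sd)
    return kept
```

Because the mean and SD are computed within each condition x region cell, a reading time that is extreme for one cell is not judged against the distribution of another.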

For statistical analyses, linear mixed effects (LME) models were fitted using the
lmerTest package (which depends on lme4) in the statistical software R (version
3.6.2, 2019-12-12). We fitted the models with the Locality and Dependency Type
factors as fixed effects. Deviation coding was used to code the main effects and 
interactions, with the Nom(inative) conditions and the Local conditions treated as 
baselines such that we could see, relative to the baselines, the effects of having 
dependencies with sika, dake (Only), or dake in an affirmative context (OnlyAff) 
and the effects of having long-distance dependencies (Locality). 

We also included the reading times in the pre-critical region as a fixed effect
(labeled “spillover”) in the models (Vasishth and Lewis 2006) because the words in
the region immediately preceding the critical region were not constant between
the Local and Distant conditions, and their effects may have spilled over into the response
times (RTs) in the critical region. The values of this factor were centered and scaled
before being entered into the models because the values would otherwise be too different
in scale from the other fixed factors. Though models without this spillover
factor showed essentially similar results, we report the results of the
models with the spillover factor. Random intercepts for participants and items were included in
the model; random slopes were not, as the model had too
many factors, and the inclusion of random slopes prevented the model from reaching
convergence.
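
Centering and scaling a predictor amounts to converting it to z-scores. The sketch below uses the sample standard deviation, the same convention as R's scale(); the function name and the reading times are hypothetical.

```python
from statistics import mean, stdev


def center_and_scale(values):
    """Return z-scores: subtract the mean, then divide by the sample SD."""
    m, sd = mean(values), stdev(values)
    if sd == 0:
        raise ValueError("cannot scale a constant predictor")
    return [(v - m) / sd for v in values]


# Hypothetical pre-critical-region reading times in ms.
spillover_raw = [420.0, 515.0, 388.0, 640.0, 472.0]
spillover = center_and_scale(spillover_raw)
```

After this transformation the predictor has a mean of zero and an SD of one, so its coefficient is expressed per standard deviation of pre-critical reading time rather than per millisecond.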

The results from the best-fitting LME model revealed a marginally significant main effect of sika
(t = 1.93, p = .054, with sika slower) and significant main effects of Only (t = 4.35, p < .001, with Only slower) and
OnlyAff (t = -7.75, p < .001, with OnlyAff faster) but no interactions (all |t|s < 0.6, all
ps > .5). However, on dividing the participants into two groups according to their comprehension
accuracy (CA) rates on the filler items, a different picture emerged. The mean
raw RTs for the critical verb region in the data from the upper group (n = 26), whose
CA rates were equal to or above the median (83.2%), showed a tendency toward an
interaction between the Locality and the sika factors (t = 1.87, p = .063) such that
distance slowed the sika conditions more than the nominative conditions,
whereas the lower group (n = 25) revealed the opposite tendency (t = -1.79, p = .074).
No such effects were found in the comparison between the Nom conditions and
the Only or OnlyAff conditions (all |t|s < 0.78, all ps > .43). Further, the lower group
seems to have read the critical region much faster than the upper group (estimated
intercepts: 784.5 ms for the lower group vs. 1137.3 ms for the upper group). Hence, good and poor
readers (per the CA rates) may have had different strategies when processing the
negative polarity dependencies.

We conducted post hoc analyses of the data including participants’ comprehension
performances (centered and scaled) on the 96 filler items as a fixed effect in
the model to see if this tendency is statistically robust. We fitted an LME model to
the data regardless of whether the trials were answered correctly because the data
would otherwise be skewed toward participants with higher accuracy
rates. Table 2 shows a summary of the results from the best-fitting LME model.³
The analyses showed the main effects of sika (t = 3.04, p = .002) and Only (t = 3.57,
p < .001) relative to the baseline nominative conditions, showing that these markers
incurred extra cost. A main effect of OnlyAff was also found (t = -8.49, p < .001),
which is a trivial finding because the verb region of the OnlyAff conditions lacked
a negative morpheme. There was a strong main effect of CA rates (t = 5.09, p < .001)
such that higher comprehension rates correlated with longer reading times.

Despite no interaction of the Locality factor and the sika factor per se, there was
a significant three-way interaction of Locality x sika x CA (t = 2.29, p = .022). Figure 1
presents a scatter plot of the locality effects (the differences in the log-transformed
RTs between the Distant and Local conditions) for the NPI (sika) conditions at the
critical region, relative to the baseline nominative conditions, overlaid with regression
lines for the NPI conditions and the Nom conditions to visualize the interaction
trend. Regression analyses showed a positive correlation between locality effects and
CA rates for the sika-marked conditions (t = 3.20, p = .002, r = .423) but no such correlation
for the nominative conditions (t = 0.05, p = .963, r = .007). Figure 2 illustrates
the contrast between good and poor readers, defined as the upper quartile group
(CA rate 85.9% or higher) and the lower quartile group (78.4% or lower) under an extreme-groups
design (cf. Conway et al. 2005: 782–783), in a visual summary of the mean RTs
of all conditions. Statistically, the good readers (n = 13) showed a significant locality
effect for sika (t = 2.23, p = .027), whereas the poor readers (n = 13) showed no such
effect (t = -0.34, p = .733). No other terms reached significance (all ps > .1), except for
predicted (and irrelevant) main effects of OnlyAff (ts < -3, ps < .001).
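
The per-participant locality effect and its relation to CA can be sketched as follows. The function names are hypothetical, and the sketch uses a plain Pearson correlation with the textbook conversion from r to t; it is not the regression code used for the reported analyses.

```python
import math
from statistics import mean


def locality_effect(distant_rts, local_rts):
    """Mean log RT in the Distant condition minus mean log RT in the Local
    condition for one participant (positive = slowdown with distance)."""
    return mean(math.log(rt) for rt in distant_rts) - mean(math.log(rt) for rt in local_rts)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))
```

Given one locality effect and one scaled CA score per participant, `pearson_r` yields the kind of correlation plotted in Figure 1, and `t_from_r` converts it to a test statistic.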

There were also interactions between CA and Locality (t = 2.15, p = .032), suggesting
that participants with higher CA rates were more careful in integrating long-distance
dependencies, and between CA and sika (t = 2.20, p = .028), suggesting that those
readers were more sensitive to the presence of the NPI marker and its retrieval. 
An interaction between CA and OnlyAff, which shows a reverse trend (t = -3.06, 
p = .002), indicates that the magnitude of the facilitation effect given the absence of 
the negative morpheme (in the OnlyAff conditions) relative to its presence (in the 
Nom conditions) was greater when the CA rate was higher, possibly because good 
readers were more careful when processing negated sentences than poor readers. 


3 The model used was as follows: rt ~ Locality + sika + Only + OnlyAff + Locality:sika + Locali- 
ty:Only + Locality:OnlyAff + CA + CA:Locality + CA:sika + CA:Only + CA:OnlyAff + CA:Locality:sika + 
CA:Locality:Only + CA:Locality:OnlyAff + spillover + (1|subj) + (1 |item). 
Table 2: Results of linear mixed effects model analysis for Experiment 1.


                      Estimate     SE   t-value   Pr(>|t|)
(Intercept)              912.1   38.4     23.77   .000 ***
Locality                  14.8   13.8      1.07   .285
sika                      72.5   23.9      3.04   .002 **
Only                      85.6   24.0      3.57   .000 ***
OnlyAff                 -202.7   23.9     -8.49   .000 ***
CA                       164.0   32.2      5.09   .000 ***
spillover                 27.6   14.4      1.92   .055 .
Locality:sika             12.1   23.9      0.51   .613
Locality:Only            -13.8   24.0     -0.58   .565
Locality:OnlyAff          12.6   23.9      0.53   .597
Locality:CA               29.9   13.9      2.15   .032 *
sika:CA                   53.6   24.4      2.20   .028 *
Only:CA                   43.5   24.5      1.78   .076 .
OnlyAff:CA               -74.9   24.5     -3.06   .002 **
Locality:sika:CA          55.1   24.1      2.29   .022 *
Locality:Only:CA         -36.6   24.3     -1.51   [illegible]
Locality:OnlyAff:CA       -4.7   24.3     -0.19   .848


Signif. codes: 0 “***” .001 “**” .01 “*” .05 “.” 0.1 “ ” 1
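
As a reading aid for the table, each t-value is the estimate divided by its standard error. The sketch below recomputes two rows and derives a two-sided p-value from a normal approximation. The approximation is an assumption made here for illustration; the chapter does not state how its p-values were obtained, and recomputed values can differ slightly from the printed ones because the table entries are rounded.

```python
import math

def t_and_p(estimate, se):
    """Return (t, p): t = estimate / SE and a two-sided p-value.

    The p-value uses the standard normal distribution as an approximation
    (an assumption of this sketch, reasonable only for large samples).
    """
    t = estimate / se
    p = math.erfc(abs(t) / math.sqrt(2))  # equals 2 * (1 - Phi(|t|))
    return t, p

# (Estimate, SE) pairs copied from two rows of Table 2.
rows = {"sika": (72.5, 23.9), "Locality:CA": (29.9, 13.9)}
for name, (est, se) in rows.items():
    t, p = t_and_p(est, se)
    print(f"{name}: t = {t:.2f}, p = {p:.3f}")
```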
Figure 1: Differences between the log-transformed response times of the Distant conditions
and those of the Local conditions at the critical verb region, as a function of centered and scaled 
comprehension accuracies. 
Figure 2: The mean raw response times of the critical region for the good readers (left) and the poor
readers (right), with error bars representing 95% confidence intervals. 


3.3 Discussion 


Although locality effects per se were not found for sika, Only, or OnlyAff relative to 
the baseline nominative conditions, there was an interaction between the locality 
effects for the sika-marked conditions and the participants' CA rates: participants 
who scored better on comprehension questions showed stronger locality effects. 
CA rates also showed a strong main effect such that better readers tended to be 
slower, indicating that better readers were more careful in processing sentences. 
Further, there were no interactions between CA and Locality effects with the Only 
or OnlyAff conditions, even though these conditions were semantically comparable 
to the sika conditions. As noted, dake “only” does not call for a syntactic licenser. 
The presence of locality effects for sika and their absence for dake suggests that it 
was not the semantic computation of exclusivity but the setup of an extra depend- 
ency chain that incurred locality effects. 

One might be skeptical about these conclusions because the results did
not reveal straightforward locality effects for sika; we found only an interaction
between locality effects and CA rates. Under our hypothesis, the processing of
sika involves two kinds of effects that counter each other: the processing load for
retrieving distant sika and the facilitation by an expectation for Neg. Thus, locality
effects at the critical region would be observed only when the retrieval of the
antecedent sika is costly enough to override the expectation-based facilitation effect
(cf. Levy and Keller 2013). Hence, perhaps, we did not find straightforward locality
effects because the distance-based retrieval cost was not large enough. In
Experiment 2, we augment the retrieval cost at the critical region by making the distance
between NP-sika and Neg greater by adding a word. We may detect locality effects
across all the participants if the distance-based cost is large enough.


4 Experiment 2 


Experiment 2 was essentially identical to Experiment 1, except that the dependency 
distance in the Distant conditions was made one region greater by adding a locative 
adjunct to find more robust locality effects. 


4.1 Methods and materials 
Participants 


The participants comprised 77 native speakers of Japanese, mostly undergradu- 
ate students at the same university as in Experiment 1. None had participated in 
Experiment 1. They were paid 800 yen for their participation, which lasted approx- 
imately 30 minutes. 


Design and materials 


We adopted the same design, target and filler materials, and procedures as in Exper- 
iment 1, except that a locative adjunct was added to the embedded clause of each 
target item, making the distance between the subject and the critical verb in the 
Distant conditions (6a-c, g) greater by one region. For example, the sika x Distant 
condition (6c) was transformed into (7) below, with a locative adjunct (underlined) 
added immediately after the embedded subject: 


(7) sika x Distant 


tentyoo sika / ueetoresu ga / tennai de / zyoorenkyaku o
manager SIKA / waitress NOM / inside.shop at / regular.customer ACC
/ nagut-ta to / sinzi-nakat-ta node / ...
/ hit-PAST COMP / believe-NEG-PAST because / ...
“Because nobody but the manager believed that the waitress hit the
regular customer in the restaurant, . . .”
Unlike most English prepositions, Japanese postpositions have no adnominal/
adverbial ambiguity; the locative phrases used in this experiment were always
unambiguously adverbial. This extra phrase was added in the same position
(i.e., immediately after the embedded subject) in all the other seven conditions.


Procedure 


The procedure and filler items of Experiment 2 were identical to those of
Experiment 1.


4.2 Results 
Comprehension accuracy 


The mean accuracy rate of all items (including fillers and excluding practice
items) was 78.2%, and the mean accuracy rate of the items for this experiment was
74.4%. The breakdown of the mean accuracy rate by conditions was as follows:
Nom x Distant 71.8%; Only x Distant 69.5%; sika x Distant 66.9%; Nom x Local 75.0%;
Only x Local 78.6%; sika x Local 75.6%; OnlyAff x Distant 79.9%; and OnlyAff x
Local 78.6%. The fitted logistic regression model did not reveal any main effects or
interactions (all ps > .2).


Reading times 


The statistical analyses for Experiment 2 followed those of Experiment 1. When we
analyzed the correctly answered data points (within 3 SDs of the relevant condition
x region cell mean) without considering participants’ comprehension performance,
we found main effects of Only (t = 2.89, p = .0039, with Only slower) and OnlyAff
(t = -7.13, p < .001, with OnlyAff faster) but no main effect of sika (t = 1.30, p = .196); there
was a weak tendency toward a locality effect with sika (t = 1.77, p = .077) but not with
Only or OnlyAff (|t|s < 1.2, ps > .2). As in Experiment 1, we found a contrast
between the two groups divided by the CA rates for filler items at the median (79.2%):
the upper group (n = 41) showed a locality effect for sika (t = 3.17, p = .002), while the
lower group (n = 36) showed an opposite tendency (t = 1.70, p = .090). Thus, we re-fitted
the model with participants' centered and scaled CA rates and relevant interactions
to the data (irrespective of whether they were correctly answered). Table 3
summarizes the results from the best-fitting LME model. The analyses showed main
effects of Only (t = 3.00, p = .003), OnlyAff (t = -7.63, p < .001), CA (t = 4.19, p < .001), and
spillover (t = 6.70, p < .001). There was also an interaction of CA and OnlyAff (t = -2.79,
p = .005).
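
The trimming criterion mentioned above (data points within 3 SDs of the relevant condition x region cell mean) can be sketched as follows. The data layout, the function name, and the use of the sample standard deviation are assumptions made for illustration, and the demo values are invented.

```python
from collections import defaultdict
from statistics import mean, stdev

def trim_by_cell(trials, n_sd=3.0):
    """Keep trials whose RT lies within n_sd SDs of its condition x region cell mean.

    trials: list of dicts with keys "condition", "region", and "rt" (ms).
    The sample SD is used here; that choice is an assumption of this sketch.
    """
    cells = defaultdict(list)
    for tr in trials:
        cells[(tr["condition"], tr["region"])].append(tr["rt"])

    kept = []
    for tr in trials:
        rts = cells[(tr["condition"], tr["region"])]
        if len(rts) < 2:
            kept.append(tr)  # SD is undefined for a single observation
            continue
        m, sd = mean(rts), stdev(rts)
        if abs(tr["rt"] - m) <= n_sd * sd:
            kept.append(tr)
    return kept

# Hypothetical cell: nineteen ordinary RTs and one extreme value.
demo = [{"condition": "sika x Distant", "region": "critical", "rt": rt}
        for rt in [490.0] * 10 + [510.0] * 9 + [5000.0]]
kept = trim_by_cell(demo)
print(len(demo), "->", len(kept))
```

Because the cell mean and SD are computed with the extreme value included, a single outlier can only exceed 3 SDs when the cell holds enough observations; with very small cells nothing is ever trimmed.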

There was a strong main effect of CA such that participants with higher CA 
rates were slower at reading the critical region (t = 4.19, p < .001). More impor- 
tantly, we found a significant three-way interaction of CA x Locality x sika (t = 2.60, 
p = .009). Figure 3 shows the scatter plot of the locality effects for the NPI (sika) 
conditions at the critical region, relative to the baseline nominative conditions. 
Regression analyses revealed a positive correlation between locality effects and CA 
for the sika-marked conditions (t = 3.96, p < .001, r = .423) but no such correlation for 
the nominative conditions (t = 0.50, p = .619, r = .059). As for good and poor readers 
defined as the upper (CA rate 84.4% or higher) and lower (75.8% or lower) quartile 
groups, the good readers (n = 22) showed a significant locality effect for sika (t = 3.64,
p < .001), and the poor readers (n = 20) showed no such effect (t = -0.25, p = .801).
Figure 4 summarizes the contrast between good and poor readers regarding the 
mean RTs at the critical region. 
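
The quantities behind the regression just described can be sketched as follows: a per-participant locality effect (mean log RT in the Distant condition minus mean log RT in the Local condition, as in the figure captions), centered and scaled CA rates, and a Pearson correlation between the two. All function names and numbers below are hypothetical; this is not the analysis script used in the study.

```python
import math
from statistics import mean, stdev

def zscore(values):
    """Center values on their mean and scale by the sample SD."""
    m, sd = mean(values), stdev(values)
    return [(v - m) / sd for v in values]

def locality_effect(distant_rts, local_rts):
    """Mean log RT (Distant) minus mean log RT (Local) for one participant."""
    return mean(math.log(rt) for rt in distant_rts) - mean(math.log(rt) for rt in local_rts)

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data for three participants (CA in percent, RTs in ms).
ca = zscore([70.0, 80.0, 90.0])
effects = [
    locality_effect([900.0], [950.0]),    # faster when distant
    locality_effect([1000.0], [1000.0]),  # no difference
    locality_effect([1100.0], [1000.0]),  # slower when distant
]
print(round(pearson_r(ca, effects), 3))
```

A positive coefficient in such a computation corresponds to the pattern reported for the sika-marked conditions: the higher the CA rate, the larger the slowdown in the Distant condition.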


Table 3: Results of linear mixed effects model analysis for Experiment 2. 


                      Estimate     SE            t-value   Pr(>|t|)
(Intercept)              705.2   24.3              29.07   .000 ***
Locality                   3.6   [illegible]        0.47   .637
sika                      18.7   13.2               1.41   .158
Only                      39.8   13.2               3.00   .003 **
OnlyAff                 -101.2   13.3              -7.63   .000 ***
CA                        93.2   22.2               4.19   .000 ***
spillover                 62.8    9.4               6.70   .000 ***
Locality:sika             17.9   13.2               1.36   .175
Locality:Only            -13.6   13.3              -1.02   .307
Locality:OnlyAff         -10.5   13.3              -0.79   .431
Locality:CA               13.6    7.4               1.76   .078 .
sika:CA                   -0.8   13.4              -0.06   .954
Only:CA                    9.8   13.4               0.73   .465
OnlyAff:CA               -37.5   13.4              -2.79   .005 **
Locality:sika:CA          34.9   13.4               2.60   .009 **
Locality:Only:CA          -9.5   13.4              -0.72   .475
Locality:OnlyAff:CA      -15.4   13.4              -1.15   .251


Signif. codes: 0 “***” .001 “**” .01 “*” .05 “.” 0.1 “ ” 1
Figure 3: Differences between the log-transformed response times of the Distant conditions
and those of the Local conditions at the critical verb region, as a function of centered and scaled 
comprehension accuracies. 
Figure 4: The mean raw response times of the critical region for the good readers (left) and the poor
readers (right), with error bars representing 95% confidence intervals.


4.3 Discussion 


Experiment 2 yielded similar results to Experiment 1, except for a weak indication of
locality effects for sika. As in Experiment 1, however, there was a significant interaction
between locality effects in the sika conditions and CA rates, such that
locality effects tended to be stronger when CA rates were higher. This suggests
that the processing of NP-sika may be a function of reading comprehension skills
because the NPI marker sika introduces an extra dependency that adds to memory load.
The finding that the semantically comparable conditions with dake (Only / OnlyAff) did
not pattern with the sika conditions suggests that the obligatory setup of a new grammatical
relation, not the semantic exclusivity per se, invoked a decay-based locality effect.

Although Experiments 1 and 2 yielded similar results, there was one clear contrast:
the critical verb region was read faster in Experiment 2 than in Experiment 1
(703 ms vs. 909 ms), a strong indication of position effects (Ferreira and Henderson
1993). Recall that the target items in Experiments 1 and 2 were identical, except that the
latter had one extra element (a locative PP), which pushed the critical region one region
further away. The filler items were identical. We conducted a between-participants
meta-analysis of the results of Experiments 1 and 2 combined (128 participants), using
a model that included the Experiment factor as a fixed effect in addition to all the relevant factors.
Table 4 summarizes the results, revealing a significant facilitation effect of the
Experiment factor (t = -5.36, p < .001). Thus, as per several studies, varying the dependency
distance by simply adding an intervening word is not appropriate for testing
memory-based locality effects. The results confirmed a robust CA x Locality x sika interaction
(t = 3.41, p = .001); no such interaction was found with other Dependency Types.


Table 4: Results of the meta-analysis of the combined results of Experiments
1 and 2 using a linear mixed effects model.


                      Estimate     SE   t-value   Pr(>|t|)
(Intercept)              807.5   23.0     35.14   .000 ***
Experiment              -101.9   19.0     -5.36   .000 ***
Locality                   7.5    7.2      1.04   .298
sika                      41.0   12.5      3.28   .001 **
Only                      58.1   12.5      4.64   .000 ***
OnlyAff                 -141.7   12.5    -11.32   .000 ***
CA                       121.5   18.8      6.46   .000 ***
spillover                 46.4    8.2      5.65   .000 ***
Locality:sika             16.4   12.5      1.31   .190
Locality:Only            -14.4   12.5     -1.15   .252
Locality:OnlyAff          -0.8   12.5     -0.06   .950
Locality:CA               21.0    7.3      2.88   .004 **
sika:CA                   20.1   12.7      1.59   .112
Only:CA                   24.6   12.6      1.95   .051 .
OnlyAff:CA               -52.3   12.7     -4.13   .000 ***
Locality:sika:CA          43.1   12.6      3.41   .001 ***
Locality:Only:CA         -20.6   12.6     -1.63   .102
Locality:OnlyAff:CA      -12.6   12.7     -1.00   .320


Signif. codes: 0 “***” .001 “**” .01 “*” .05 “.” 0.1 “ ” 1
5 General discussion


This study tested the hypothesis that locality effects are a function of dependency 
length and type. Adopting activation-based working memory retrieval models (Thi- 
badeau, Just, and Carpenter 1982; King and Just 1991; Vasishth and Lewis 2006), 
we assumed that non-thematic dependencies are more prone to memory decay 
when dependency lengths are greater because they tend to be linearly discontin- 
uous and, thus, do not receive maintenance support from intervening elements 
in working memory. We conducted two self-paced reading experiments to test 
whether NPI-marker sika would invoke a locality effect relative to its nominative 
control. In Experiment 1, there was no interaction of sika and Locality; in
Experiment 2, where the dependency length in the Distant conditions was greater by one word,
we found only a marginal tendency toward an interaction. However, when
we included participants’ CA rates in the models, both experiments yielded a
significant three-way interaction of sika, Locality, and participants’ CA rates, such
that better readers tended to show greater locality effects (i.e., longer reading
times when sika was distant). Such an interaction was not found with semantically
comparable dake “only,” suggesting that the requirement for polarity triggered by
sika incurred an extra complexity that selectively affected good readers. However,
why was the locality effect with sika a function of comprehension performance?
Prior findings suggest interactions between working memory capacities and 
reading behaviors. It is known that individual differences in working memory 
capacity induce differences in reading times and CAs, interacting with structural 
factors (Just and Carpenter 1992; King and Just 1991; MacDonald, Just, and Carpen- 
ter 1992). Nicenboim et al. (2016) find an interaction between locality effects and 
individual working memory capacities such that locality effects were greater with 
high-capacity readers, while an anti-locality trend was found with readers with 
lower working capacity. They conjecture that this interaction is an indication of 
forgetting effects (Gibson and Thomas 1999): low-capacity readers tended to lose 
track of longer dependencies, failing to integrate them, thus failing to show local- 
ity effects. MacDonald, Just, and Carpenter (1992) probe the processing complexity 
of ambiguous sentences and report that high-span readers showed longer reading 
times than low-span readers. They also find an interaction of the capacity factor 
and the ambiguity factor such that the slowdown effects of temporal ambiguity 
were greater with high-span readers. They conclude that high-span readers could 
maintain multiple structural analyses for a longer period than low-span readers.* 


4 King and Just (1991) probe the interaction between working memory capacity and the process- 
ing complexity of relative clauses (subject- vs. object-extraction) and report seemingly opposite 
Many previous studies report that the working memory measure is highly
correlated with general reading skill measures, including CAs.

Thus, we conjecture that poor readers, who are likely to have lower working memory
capacity, are more likely to lose track of multiple dependencies (cf. MacDonald
et al. 1992). Recall our hypothesis that the presence of the NPI marker sika
introduces an extra dependency. Presumably, the distance-based retrieval cost for sika
was found only with good readers because they could successfully keep multiple
dependencies in working memory. These findings are highly compatible with
Nicenboim et al. (2016), where low working memory capacity readers tended 
to show anti-locality effects, while high-capacity readers tended to show local- 
ity effects. The results are also compatible with MacDonald et al. (1992), where 
high-capacity readers could track multiple analyses of locally ambiguous sen- 
tences and, thus, were slower than low-capacity readers. Although comprehension 
question accuracy cannot be considered directly reflective of working memory 
capacity, working memory capacity is highly correlated with various reading skills 
(Just and Carpenter 1992; King and Just 1991; MacDonald et al. 1992). Working memory
capacity is measured by complex tasks (see Conway et al. 2005 for review) and is thus
regarded as comprising an attention-inhibition component and a storage component (Conway and Engle
1994; Engle et al. 1999). Keeping track of multiple distinct dependencies may be
comparable to complex memory tasks. We assume individual working memory 
capacities and individual comprehension performances affect reading behaviors 
in the same direction, but the validity of this claim needs further examination, 
which is left open for future research. 

One final issue that should be addressed is the question of what type of
expectation for a dependency would incur a decay-driven retrieval cost. For example,
Husain, Vasishth, and Srinivasan (2014) tested the interaction of locality effects and
the strength of expectation in Hindi. They employed idiomatic noun-verb
combinations, such as khayaal rakhnaa “(lit.) care keep” = “take care of,” against
non-idiomatic combinations, such as gitaar rakhnaa “guitar keep,” to vary the expectation
factor. They found anti-locality effects when the noun triggered a strong
expectation for a specific verb (khayaal . . . rakhe) but not when it did not (gitaar . . . rakhe).
Thus, the strong expectation for a specific verb worked in the opposite direction
to the grammatical expectation for a Neg triggered by an NPI in our experiments.
This indicates that the expectation based on a fixed complex expression is
qualitatively different from the expectation for Neg triggered by an NPI. In the former


results to those of Nicenboim et al. (2016) and MacDonald et al. (1992), such that low-span readers
showed greater reading times with object-extracted relative clauses, assumed to be structurally
more complex than subject-extracted ones. Interestingly, King and Just (1991) also reported that
this effect was absent with “non-readers” whose CA rates were at chance levels.
case, the expectation is thematic: when khayaal “care” is encountered, the thematic
interpretation “take care of” is immediately established; in this sense, this
“expectation” is part of the already present thematic chain. However, the computation of
the NPI-Neg dependency hinges upon the completion of the thematic computation
of the proposition. This may explain why the latter type of dependency, not the former,
poses some working memory load and may incur locality effects.

One may conclude that the qualitative difference between the expectation 
based on idiomatic noun-verb complex expressions on the one hand and that based 
on NPI-Neg dependencies as well as wh-gap dependencies on the other hinges on the 
presence (absence) of a syntactic feature (i.e., the latter dependencies incur locality 
effects because they involve formal syntactic checking operations). However,
Nakatani (2021) reports locality effects of maximally positive adverbials marked with
the contrastive marker wa (e.g., hakkirito-wa “clearly”), which behave like negative
polarity items. It is not very plausible to assume that these adverbial “pseudo-NPIs”
involve a formal syntactic NPI feature that must be checked because it is not
completely ungrammatical for them to appear in an affirmative context without Neg (see
Nakatani 2021 for details). Therefore, the type of dependency that counts as an extra
dependency is not exclusively limited to formal syntactic feature-checking relations.
Further research is needed to explicate what type of expectation for a head to come
leads to the establishment of “multiple” dependencies that may incur locality effects.


References 


Anderson, John R., Daniel Bothell, Michael D. Byrne, Scott Douglass, Christian Lebiere, & Yulin Qin. 
2004. An integrated theory of the mind. Psychological Review 111(4). 1036. 

Bartek, Brian, Richard L. Lewis, Shravan Vasishth, & Mason R. Smith. 2011. In search of on-line locality 
effects in sentence comprehension. Journal of Experimental Psychology: Learning, Memory, and 
Cognition 37. 1178-1198. 

Chomsky, Noam & George A. Miller. 1963. Introduction to the formal analysis of natural languages. In 
R. Duncan Luce, Robert R. Bush, & Eugene Galanter (eds.), Handbook of Mathematical Psychology, 
vol. 2, 269-321. New York, NY: Wiley. 

Conway, Andrew R. & Randall W. Engle. 1994. Working memory and retrieval: A resource-dependent 
inhibition model. Journal of Experimental Psychology: General 123(4). 354-373. 

Conway, Andrew R., Michael J. Kane, Michael F. Bunting, D. Zach Hambrick, Oliver Wilhelm, & Randall 
W. Engle. 2005. Working memory span tasks: A methodological review and user's guide. 
Psychonomic Bulletin and Review 12. 769-786. 

Daneman, Meredyth & Patricia A. Carpenter. 1980. Individual differences in working memory and 
reading. Journal of Verbal Learning and Verbal Behavior 19. 450-466. 

Daneman, Meredyth & Philip M. Merikle. 1996. Working memory and language comprehension: A 
meta-analysis. Psychonomic Bulletin and Review 3. 422-433. 




Engle, Randall W., James E. Laughlin, Stephen W. Tuholski, & Andrew R. Conway. 1999. Working 
memory, short-term memory, and general fluid intelligence: A latent-variable approach. Journal 
of Experimental Psychology: General 128. 309-331. 
Ferreira, Fernanda & John M. Henderson. 1993. Reading processes during syntactic analysis and 
reanalysis. Canadian Journal of Experimental Psychology 47. 247-275. 
Gibson, Edward. 1998. Linguistic complexity: Locality of syntactic dependencies. Cognition 68. 1-76. 
Gibson, Edward. 2000. The dependency locality theory: A distance-based theory of linguistic 
complexity. In Alec Marantz, Yasushi Miyashita, & Wayne O'Neil (eds.), Image, Language, Brain: 
Papers from the First Mind Articulation Project Symposium, 95-126. Cambridge, MA: MIT Press. 
Gibson, Edward & James Thomas. 1999. Memory limitations and structural forgetting: The perception 
of complex ungrammatical sentences as grammatical. Language and Cognitive Processes 14. 
225-248. 
Grodner, Daniel J. & Edward Gibson. 2005. Consequences of the serial nature of linguistic input for
sentential complexity. Cognitive Science 29. 261-291.
Husain, Samar, Shravan Vasishth, & Narayanan Srinivasan. 2014. Strong expectations cancel locality 
effects: Evidence from Hindi. PLOS ONE 9(7). 1-14. 
Just, Marcel A. & Patricia A. Carpenter. 1992. A capacity theory of comprehension: Individual 
differences in working memory. Psychological Review 99(1). 122-149. 
Just, Marcel A., Patricia A. Carpenter, & Jacqueline D. Woolley. 1982. Paradigms and processes in 
reading comprehension. Journal of Experimental Psychology: General 111. 228-238. 
King, Jonathan & Marcel A. Just. 1991. Individual differences in syntactic processing: The role of 
working memory. Journal of Memory and Language 30(5). 580-602. 
Konieczny, Lars. 2000. Locality and parsing complexity. Journal of Psycholinguistic Research 29. 627-645. 
Konieczny, Lars & Philipp Döring. 2003. Anticipation of clause-final heads: Evidence from eye-tracking
and SRNs. Proceedings of the 4th International Conference on Cognitive Science, 330-335. Sydney: 
University of New South Wales. 
Ladusaw, William. 1979. Polarity Sensitivity as Inherent Scope Relations. New York, NY: Garland 
Publishing. 
Levy, Roger, Evelina Fedorenko, & Edward Gibson. 2013. The syntactic complexity of Russian relative 
clauses. Journal of Memory and Language 69. 461-495. 
Levy, Roger & Frank Keller. 2013. Expectation and locality effects in German verb-final structures. 
Journal of Memory and Language 68. 199-222. 
MacDonald, Maryellen C., Marcel A. Just, & Patricia A. Carpenter. 1992. Working memory constraints 
on the processing of syntactic ambiguity. Cognitive Psychology 24(1). 56-98. 
Miyagawa, Shigeru, Nobuaki Nishioka, & Hedde Zeijlstra. 2016. Negative sensitive items and the 
discourse-configurational nature of Japanese. Glossa 1: 33. 1-28. 
Nakatani, Kentaro. 2021. Locality effects in the processing of negative-sensitive adverbials in Japanese. 
In Reiko Okabe, Jun Yashima, Yusuke Kubota, & Tatsuya Isono (eds.), The Joy and Enjoyment of 
Linguistic Research: A Festschrift for Takane Ito, 462-472. Tokyo: Kaitakusha. 
Nakatani, Kentaro & Edward Gibson. 2008. Distinguishing theories of syntactic expectation cost in 
sentence comprehension: Evidence from Japanese. Linguistics 46. 63-87. 
Nakatani, Kentaro & Edward Gibson. 2010. An on-line study of Japanese nesting complexity. Cognitive 
Science 34. 94-112. 
Nicenboim, Bruno, Pavel Logacev, Carolina Gattei, & Shravan Vasishth. 2016. When high-capacity 
readers slow down and low-capacity readers speed up: Working memory and locality effects. 
Frontiers in Psychology 7. 1-24. 




Ono, Hajime & Kentaro Nakatani. 2014. Integration costs in the processing of Japanese wh-inter- 
rogative sentences. Studies in Language Sciences 13. 13-31. 

Osaka, Mariko & Naoyuki Osaka. 1992. Language-independent working memory as measured by 
Japanese and English reading span tests. Bulletin of the Psychonomic Society 30. 287-289. 

Phillips, Colin, Nina Kazanina, & Shani H. Abada. 2005. ERP effects of the processing of syntactic
long-distance dependencies. Cognitive Brain Research 22. 407-428. 

Safavi, Molood S., Samar Husain, & Shravan Vasishth. 2016. Dependency resolution difficulty 
increases with distance in Persian separable complex predicates: Evidence for expectation and 
memory-based accounts. Frontiers in Psychology 7: 403. 1-15. 

Thibadeau, Robert, Marcel A. Just, & Patricia A. Carpenter. 1982. A model of the time course and 
content of reading. Cognitive Science 6. 157-203. 

Van Dyke, Julie A. & Richard L. Lewis. 2003. Distinguishing effects of structure and decay on 
attachment and repair: A retrieval interference theory of recovery from misanalyzed ambiguities. 
Journal of Memory and Language 49. 285-413. 

Vasishth, Shravan & Heiner Drenhaus. 2011. Locality in German. Dialogue and Discourse 1. 59-82. 

Vasishth, Shravan & Richard L. Lewis. 2006. Argument-head distance and processing complexity: 
Explaining both locality and antilocality effects. Language 82. 767-794. 

Von Fintel, Kai. 1999. NPI licensing, Strawson entailment, and context dependency. Journal of Semantics 
16. 97-148. 

Yngve, V.H. 1960. A model and a hypothesis for language structure. Proceedings of the American 
Philosophical Society, 104, 444-466. 

Yoshida, Masaya. 2002. When negative statements are easier: Processing of polarity items in 
Japanese. In Tetsuya Sano, Mika Endo, Miwa Isobe, Koichi Otaki, Koji Sugisaki, & Takeru Suzuki 
(eds.), An Enterprise in the Cognitive Science of Language: A Festschrift for Yukio Otsu, 585-598. 
Tokyo: Hituzi Syobo. 


Shingo Tokimoto and Naoko Tokimoto 


Chapter 4 

An EEG analysis of long-distance scrambling 
in Japanese: Head direction, reanalysis, 

and working memory constraints 


1 Introduction: Constraints on discontinuous 
dependency 


In a natural language sentence, two morphemes discontinuous in a time series can 
establish a semantically closer relationship than their adjacent morphemes. The 
words underlined in (1) are some examples. 


(1) a. If you don't feel well, then you can go home right away.
b. I don't want anybody to disturb me.
c. A review came out yesterday of this book.
d. The woman who you were talking about is my sister.


Discontinuous dependency is cross-linguistic. In the Japanese examples in (2), the
underlined words establish discontinuous dependency (the abbreviations -NOM,
-TOP, and -ACC mean nominative, topic, and accusative, respectively).


(2) a. If... then
Moshi netsu-ga aru-nara, kaette yasumi-nasai.
If fever-NOM exist then go home to rest
“If you have a fever, then you should go home to rest.”
b. Negative polarity item
Sofu-wa joobu-de ichido-shika kaze-o
grandfather-TOP strong once-only cold-ACC
hiita-koto-ga nai.
had fact-NOM not
“My grandfather is so strong that he has had a cold only once.”


Open Access. © 2024 the author(s), published by De Gruyter. This work is licensed under the Creative
Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. 
https://doi.org/10.1515/9783110778939-004 
c. Quantifier floating
Chichi-wa san-bai asagohan-ni coffee-o nomimashita.
father-TOP three breakfast at coffee-ACC drank
“My father drank three cups of coffee at breakfast.”


Discontinuous dependency is evidence that the set of sentences of a natural language 
exceeds the generative capacity of finite-state grammar. This phenomenon is, thus, 
critical for the study of the computational nature of natural language, and research- 
ers on syntax and sentence processing have intensively discussed the dependency. 

Discontinuous dependency can cross the clause boundaries of verbal com- 
plement declaratives. In (3), for example, what can be interpreted as the object of 
bought, and the number of the embeddings is unlimited in principle (S represents
“sentence”).


(3) a. What do you know that [S John bought yesterday]?
b. What do you know that [S Bill thinks that [S John bought yesterday]]?
c. What do you know that [S George believes that [S Bill thinks that [S John
bought yesterday]]]?


However, it is well known that discontinuous dependency cannot always cross 
clause boundaries. In (4), for example, what cannot be interpreted as the object 
of bought. The examples in (4) indicate that discontinuous dependency is affected 
by syntactic environments. A complex noun phrase (NP) in (4a,b) and an adverbial 
adjunct clause in (4c) interfere with the dependencies (PP is the abbreviation for
“prepositional phrase”).


(4) a. *What do you know [NP the rumor that [S John bought yesterday]]?
b. *What do you know [NP the dealer where [S John bought yesterday]]?
c. *What was Bill reading a magazine [PP when [S John bought yesterday]]?


The constituent that blocks discontinuous dependency is called the syntactic island 
in linguistic studies. A complex NP and an adverbial adjunct clause are examples of 
syntactic islands in English. 

The word order in a Japanese sentence is relatively free; therefore, we can 
discuss the discontinuous dependencies in Japanese corresponding to those in the 
English examples by preposing subordinate objects by scrambling. The preposed 
sono hon-o (that book-ACC) in (5) can be interpreted as the object of katta (bought). 
The dependency crosses the clause boundary of a verbal complement declarative 
in the same manner as in (3). 


(5) Japanese discontinuous dependency crossing the boundary of a verbal
complement declarative.

Sono hon-o Hanako-ga [S Taroo-ga katta]-to omotteiru.
that book-ACC name-NOM name-NOM bought-COMP think
“That book, Hanako thinks that Taroo bought.” (Saito 1992)


The island phenomenon is assumed to be cross-linguistic, though syntactic catego- 
ries that constitute islands can vary among languages (Goodluck and Rochemont 
1992). Note that the effect of possible islands in Japanese sentences is mild, whereas 
the island effect in English is salient. Kuno (1973), for example, considers (6) to be 
marginal, with the subordinate object preposed from the inside of a complex NP. 


(6) Japanese discontinuous dependency crossing the boundary of the nominal
complement declarative in a complex NP.

?Saburoo-o Taroo-wa [NP [S Hanako-ga nikundeiru]-toiu uwasa]-o shinziteita.
name-ACC name-TOP name-NOM hate-COMP rumor-ACC believed
“As for Saburoo, Taro believed the rumor that Hanako hated him.”


Furthermore, sentences in (7) are considered grammatical, with one of the discon- 
tinuous elements in a complex NP in (7a) and in an adverbial adjunct clause in (7b). 


(7) a. Japanese discontinuous dependency crossing the boundary of the nominal
       complement declarative in a complex NP.

       Sono hon-o Jon-ga [NP [S Mary-ga katta]-toiu uwasa]-o kiita.
       that book-ACC John-NOM Mary-NOM bought-COMP rumor-ACC heard
       “(As for) that book, John heard the rumor that Mary had bought (it).”
       (Nakamura 2001)

    b. Japanese discontinuous dependency crossing the boundary of an adverbial
       adjunct clause.

       Bungakubu-ni Taro-wa [PP [S Jiroo-ga nyuugakushita]-node] odoroita.
       faculty of letters-DAT name-TOP name-NOM entered-because got surprised
       “(As for) a faculty of letters, Taro got surprised because Jiroo had entered (it).”
       (Mihara 2000)
Sprouse, Wagers, and Phillips (2012) presented English sentences to native speakers
of English and asked them to rate their acceptability on a seven-point scale.
In generating experimental sentences, they manipulated two factors (i.e., the
presence or absence of discontinuous dependency and that of a construction that
could function as a syntactic island), keeping the propositional meaning of
a subordinate clause unchanged as much as possible. With this manipulation, Sprouse,
Wagers, and Phillips (2012) succeeded in dividing the island effect into the effect
of the discontinuous dependency and that of the presence of an island construction.
The island effect was statistically evaluated as the interaction of the two factors,
and the acceptability of an island construction was evaluated relative to the
acceptability of other constructions. Tokimoto (2019) followed the
method of Sprouse, Wagers, and Phillips (2012) to compare the (possible) island
effect in Japanese with the island effects in English. (8a) is one of the
stimulus sentences in Sprouse, Wagers, and Phillips (2012), and (8b) is its counterpart
in Tokimoto (2019).


(8) a. What did the chef hear the statement that [Jeff baked]?


b. HS, FH, [RH SA 2š For] ae 
what-ACC chef-TOP Mr/Ms.Okuda-NOM baked statement-ACC 
Bu A692? 


heard-interrogative 
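The factorial logic described above can be made concrete with a small sketch. The function below computes the island effect as a difference of differences over the four cell means of a 2 × 2 design. The function name and the ratings are hypothetical, chosen only for illustration; they are not data from Sprouse, Wagers, and Phillips (2012) or Tokimoto (2019), and the actual studies evaluated the interaction statistically rather than from raw cell means.

```python
def island_interaction(means):
    """Difference of differences over a 2 x 2 acceptability design.

    `means` maps (dependency, island) -> mean rating, where each key
    component is True or False. The values used below are hypothetical.
    """
    # Cost of the discontinuous dependency in a non-island structure.
    cost_non_island = means[(False, False)] - means[(True, False)]
    # Cost of the same dependency in a (potential) island structure.
    cost_island = means[(False, True)] - means[(True, True)]
    # A positive value means the dependency is penalized more in the island.
    return cost_island - cost_non_island


# Hypothetical mean ratings on a seven-point scale (illustration only).
toy_means = {
    (False, False): 6.0,  # no dependency, non-island
    (True, False): 5.0,   # dependency, non-island
    (False, True): 5.5,   # no dependency, island
    (True, True): 2.5,    # dependency, island
}
print(island_interaction(toy_means))
```

A value near zero would correspond to the absence of an interaction, which is the pattern reported for the Japanese constructions.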


Tokimoto (2019), unlike Sprouse, Wagers, and Phillips (2012), observed no significant
interaction between the two factors for possible island constructions in Japanese
(i.e., adverbial adjunct clauses, complex noun phrases, and indirect questions).
Note a typological difference in the processing time course of (possible) island
constructions between English and Japanese, schematically shown in Figure 1 for
the examples in (8). In Figure 1A, for a complex NP in English, the presence of a
biclausal dependency is recognized at the head noun statement, and the syntactic
relationship between what and its counterpart to come (baked) is also recognized
here. For a complex NP in Japanese in Figure 1B, however, the processing time
course can be divided into two steps. First, the presence of a biclausal dependency
is recognized at the subordinate subject (Mr/Ms. Okuda-NOM)
because no Japanese verb takes three consecutive NPs case-marked as -o (-ACC),
-wa (-TOP), and -ga (-NOM) as its arguments. Second, the syntactic relationship between
the sentence-initial object (what-ACC) and the subordinate verb (baked) is recognized at the
head noun (statement) of the complex NP. Hence, the effects of establishing
discontinuous dependency and syntactic computation between the discontinuous
elements can be confounded in English. In Japanese, by contrast, the head-final word
order allows us to examine these two processes independently in an experiment.
This chapter examines the possibility that the difference in the time course of
sentence processing between English and Japanese can affect the degree of island
effects. The basic assumption is stated in (9).


[Figure 1 schematic. (A) A complex NP in English, “*What did the chef hear the
statement that [Jeff baked]?”: the biclausal dependency and the syntactic
relationship are recognized at the same point. (B) A complex NP in Japanese: the
biclausal dependency is recognized first, and the syntactic relationship is
recognized later.]

Figure 1: Typological difference in the processing time course of biclausal discontinuous dependency
between English and Japanese.


(9) Basic assumption for linguistic judgments
A linguistic judgment is a result of language processing in real time (like a visual
illusion). The time course of sentence processing can, thus, affect grammatical
judgment.


We discuss the electroencephalogram (EEG) associated with the processing of possi- 
ble syntactic islands in Japanese to examine its time course in detail. 

McKinnon and Osterhout (1996) examine the event-related potential (ERP) 
linked with the violation of syntactic island constraints in English. They visually 
presented (10) word by word and observed a late positivity (P600) for when in (10b) 
against (10a) in the centro-parietal and occipital regions. 


(10) a. I wonder whether the candidate was annoyed [when his son was
        questioned by his staff member]
     b. *I wonder which of his staff members_i the candidate was annoyed
        [when his son was questioned by e_i]


However, some researchers attribute island constraints to the constraints of working 
memory (Kluender 1998). From this theoretical standpoint, a sentence is assumed to 
be ungrammatical when the processing load of linguistic forms between discontinu- 
ous elements exceeds a threshold, and, thus, the access to the antecedent of the filler 
at a gap becomes challenging, with the assumption that processing and retention of 
information share a single resource (Just and Carpenter 1992). Accordingly, Kluender 
and Kutas (1993) visually presented sentences of (11) word by word and observed left 
anterior negativity (LAN) for the in (11c) relative to (11a). The indirect question in
(11c) is one of the syntactic islands in English, and the anterior negativity is claimed 
to be a manifestation of an additional working memory load. 


(11) a. Verbal complement declarative
        Who_i has she forgotten [that THE boss referred that matter to e_i for further
        study]?

     b. Verbal complement if-clause
        Who_i has she forgotten [if THE boss referred that matter to e_i for further
        study]?

     c. Indirect question
        *Who_i has she forgotten [what_j THE boss referred e_j to e_i for further study]?


The processing point of the recognition of discontinuous dependency and that of the 
syntactic computation of the discontinuous elements are different in Japanese, as in 
Figure 1B. Therefore, apparently contradictory findings in English can be reevaluated with
reference to the findings in Japanese. The research questions are enumerated in (12).


(12) a. Can we find different neural activities corresponding to the different 
head directions in English and Japanese? 
b. Can we find a correspondence between expected ERPs and the processing 
contents? 
c. Can we attribute the relative weakness of the island effect in Japanese to
its typological properties in sentence processing? 


In the following sections, we discuss our experiment to record the EEG elicited by
Japanese sentences with the long-distance scrambling of the subordinate object
from a verbal complement or an adverbial adjunct clause.


2 Experiment 

2.1 Method 

2.1.1 Participants 

Twenty-two native speakers of Japanese between 20 and 42 years old (M = 23.5
years, SD = 5.49 years; 15 females) participated in this study for payment. The
participants had normal or corrected-to-normal vision and had no history of
neurological or psychiatric disorders. All the participants were right-handed, as per a
handedness questionnaire (Oldfield 1971). This study was approved by the ethics 
committee of Mejiro University. Written informed consent was obtained from each 
participant. 


2.1.2 Materials 


We generated six-phrase experimental sentences, manipulating three within-participant
factors, as in (13).


(13) a. Type of subordinate clause: Verbal complement declarative (Comp) or
adverbial adjunct clause (Adjunct) 
b. Word order: Canonical or scrambled 
c. Subordinate subject: Proper name or the first-person singular pronoun 


The syntactic status of a subordinate clause was manipulated by changing the
particle at the end of the subordinate clause to -to (that) for a verbal complement
declarative or -node (because) for an adverbial adjunct clause. Adverbial adjunct clauses
were intended to be a possible syntactic island in Japanese relative to a non-island
verbal complement declarative. Discontinuous dependency was implemented by
long-distance scrambling of a subordinate object to the sentence-initial position. We
manipulated the processing load on working memory by changing the subordinate
subject to be a proper name or the first-person singular pronoun watashi, with the
assumption that a proper noun is more costly than the first-person pronoun. The
relatively light processing load of a pronoun can be observed in sentences with center
embedding. Double center-embedding is widely known to cause processing
difficulty, but some researchers have recognized that center-embedded sentences are
relatively easy to process when the most embedded NP is a first- or second-person
pronoun, as in (14b-d) against (14a). According to Warren and Gibson (2002),
integrating a new word across linguistic material indicating a new discourse referent
is more costly than integrating a word across linguistic material referring back to a
pre-existing discourse referent. In (14a), for example, two discontinuous subject-verb
relationships are included (the nanny ... was adored and the agency ... sent), and the
subjects must be retrieved at the input of their corresponding verbs, with the most
deeply embedded subject (the neighbors) intervening between the two discontinuous
dependencies as a new discourse referent. In (14b-d), the most deeply
embedded subjects, I in (14b,d) and you in (14c), also intervene between the subject-verb
relationships, but a first- or a second-person pronoun does not build a new discourse
referent because the referents of I and you are assumed to be a default part of the
domain of discourse, even in a null context, given that every discourse has a
speaker/writer and a listener/reader (Kamp and Reyle 1993). Thus, (14b-d) are easier to
process than (14a) because greater mental resources are available for the former
than the latter to retrieve the subjects at the verbs.


(14) a. The nanny who the agency which the neighbors recommended sent was
        adored by all the children. (Warren and Gibson 2002)

     b. The reporter who everyone that I met trusts said the president won't
        resign yet. (Bever 1974)

     c. Isn't it true that example sentences that people that you know produce
        are more likely to be accepted? (De Roeck et al. 1982)

     d. A book that some Italian I've never heard of wrote will be published
        soon by MIT Press. (Frank 1992)


In our experimental scrambled sentences, the matrix subjects intervene between 
the preposed subordinate objects and their corresponding subordinate verbs. 
Therefore, we can expect that a proper name will be more demanding for working 
memory than the first-person pronoun in processing the discontinuous object-verb 
relationships. 

Table 1 shows some of the experimental sentences. One hundred and sixty
sentences were generated for each of the two types of sentences, with the subordinate
subjects as proper names for the first half and as the first-person singular pronoun
for the second half. The experimental sentences with scrambled order were
generated from the sentences with canonical order by long-distance scrambling. In
generating the experimental sentences, repetition of words was avoided, except for
proper names, which were unaccented and comprised two characters and three
morae. The numbers of characters and morae and the initial sequence of accents
in a sentence were controlled for the two types of sentences. Eighty fillers were
included in the main session. The 320 experimental sentences and 80 fillers were
randomly divided into four blocks.
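The division of the items into blocks can be sketched as follows. This is a minimal illustration, assuming equal-sized blocks and simple shuffling; the chapter does not state how the random division was carried out, and the function name, item labels, and seed are hypothetical.

```python
import random


def make_blocks(n_experimental=320, n_fillers=80, n_blocks=4, seed=0):
    """Shuffle experimental and filler items and cut them into blocks.

    Assumption: blocks are equal in size (100 items each for the
    counts reported in the text). Item labels are placeholders.
    """
    items = [("experimental", i) for i in range(n_experimental)]
    items += [("filler", i) for i in range(n_fillers)]
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    rng.shuffle(items)
    size = len(items) // n_blocks
    return [items[k * size:(k + 1) * size] for k in range(n_blocks)]


blocks = make_blocks()
print([len(block) for block in blocks])
```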


2.1.3 Procedure 


The participants were seated in an electrically and acoustically shielded EEG 
chamber 1 m in front of a 19-inch LCD monitor. Each trial began with a button press 
by a participant, and a fixation point followed the button press at the center of the 
monitor for 1 second. A stimulus sentence was visually presented, phrase by phrase, 
after the fixation at the center of the monitor with a stimulus onset asynchrony of 
Table 1: Selected experimental sentences (discontinuous elements indicated by underlines).


(1) Verbal complement declarative 


P1 P2 P3 P4 P5 P6 

a. Canonical order 

FIH k [Z3E/T2: "az SA CT EE kd A BUR. 
name-TOP name/I-NOM pencil case-ACC stole-COMP classmates-DAT insisted

b. Scrambled order 

EXE AY FAL ik [ZEA KARIE 22A3—hZC ER 


‘Murata told the classmates that Adachi/I had stolen the pencil case.’


(2) Adverbial adjunct clause 


P1 P2 P3 P4 P5 P6 

a. Canonical order 

PRA Ik UKH/fh3à NV ave TEL RIOT Sab BL fe , 
name-TOP name/I-NOM PC-ACC broke-because electronics store-DAT phoned


b. Scrambled order 
^vart Hi DKH/A 2: — HULIE1O C BAUR EL Tc. 
‘Okuda phoned the electronics store because Ohta/I had broken the PC.’ 


800 ms and an interstimulus interval of 100 ms. The participants were required 
to judge grammaticality by pushing a button for each sentence (“grammatical” or 
“ungrammatical”). The order of the presentation of the stimulus sentences was 
pseudorandomized for each participant. The experiment was controlled using 
STIM2 software (Neuroscan). The practice session comprised 10 trials, the main 
session comprised four blocks, and the participants were allowed to rest for three 
to five minutes between blocks. The experimental sessions, including instruction 
and electrode applications, lasted approximately two hours. 
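The presentation timing described above can be worked out with a short sketch. It assumes that the stimulus onset asynchrony equals the display duration of a phrase plus the interstimulus interval, so that each phrase is visible for 700 ms; the constant and function names are hypothetical, and onsets are counted from the first phrase.

```python
SOA_MS = 800      # stimulus onset asynchrony reported in the text
ISI_MS = 100      # interstimulus interval reported in the text
N_PHRASES = 6     # each experimental sentence consists of six phrases


def phrase_schedule(soa_ms=SOA_MS, isi_ms=ISI_MS, n_phrases=N_PHRASES):
    """Return (onset, offset) in ms for each phrase, relative to phrase 1.

    Assumption: SOA = display duration + ISI.
    """
    duration = soa_ms - isi_ms
    return [(k * soa_ms, k * soa_ms + duration) for k in range(n_phrases)]


schedule = phrase_schedule()
for number, (onset, offset) in enumerate(schedule, start=1):
    print(f"P{number}: onset {onset} ms, offset {offset} ms")
```

Under this assumption, the third and fourth phrases begin 1,600 ms and 2,400 ms after the onset of the first phrase.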


2.1.4 Electroencephalogram recording 


A continuous EEG was recorded from 21 Ag/AgCl sintered electrodes mounted on an 
elastic cap (19 positions of the international 10/20 system, and FCz and Oz). Vertical 
and horizontal electrooculograms (EOG) were simultaneously recorded from elec- 
trodes below the left eye (VEOG) and at the outer canthus of the right eye (HEOG). 
The signals were sampled at 500 Hz with a bandpass filter of DC to 100 Hz with the 
reference electrodes positioned at the two earlobes. The electrode impedance was 
maintained at a level lower than 10 kΩ during the sessions. The EEG data were
continuously acquired using SCAN4 software (Neuroscan). 


64 = Shingo Tokimoto and Naoko Tokimoto 


2.1.5 Electroencephalogram data preprocessing 


The acquired EEG data were processed offline using EEGLAB (Delorme and Makeig
2004). The preprocessing steps were as follows. (1) The data were high-pass filtered
at 1 Hz to minimize slow drifts, with linked earlobes as the reference. (2) Line noise
was removed using the CleanLine plugin in EEGLAB. (3) High-amplitude artifacts
were removed from the EEG data using artifact subspace reconstruction (Mullen
et al. 2015). (4) The data were decomposed using an adaptive mixture of
independent component (IC) analyzers (AMICA) (Palmer et al. 2007). (5) We calculated the
best-fitting single equivalent current dipole for each IC to match the scalp
projection of each IC source using a standardized three-shell boundary element head
model. We aligned the electrode locations according to the 10-20 system with a
standard brain model (Montreal Neurological Institute). (6) We evaluated the
probabilities of the sources for each IC with the ICLabel plugin in EEGLAB
(Pion-Tonachini, Makeig, & Kreutz-Delgado 2017): brain neural activity, EOG, muscle
potentials, electrocardiogram, line noise, channel noise, and other. We chose the ICs for
which the probability of brain neural activity was greater than 70% for the
following analyses. (7) We excluded ICs from further analysis for instances in which the
equivalent dipole model explained less than 85% of the variance in the
corresponding IC scalp map. (8) We segmented the data into time epochs from -1 to 2 seconds
relative to the event markers.
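The two selection criteria in steps (6) and (7) can be summarized in a short sketch. This is not the EEGLAB code used in the study; the `Component` class, its field names, and the example values are hypothetical, and the 85% criterion is expressed as a residual variance of at most 15%.

```python
from dataclasses import dataclass


@dataclass
class Component:
    index: int
    brain_probability: float   # probability of the "brain" class, 0 to 1
    residual_variance: float   # share of scalp-map variance NOT explained by the dipole


def select_components(components, min_brain=0.70, max_residual=0.15):
    """Keep ICs classified as brain activity with a well-fitting dipole.

    An IC is kept when its brain probability exceeds 70% and its dipole
    explains at least 85% of the variance (residual variance <= 15%).
    """
    return [
        c.index
        for c in components
        if c.brain_probability > min_brain and c.residual_variance <= max_residual
    ]


# Hypothetical components for illustration only.
toy_components = [
    Component(0, 0.95, 0.05),  # kept
    Component(1, 0.60, 0.05),  # dropped: brain probability too low
    Component(2, 0.90, 0.30),  # dropped: dipole fit too poor
    Component(3, 0.71, 0.15),  # kept: both criteria just met
]
print(select_components(toy_components))
```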


2.2 Results 
2.2.1 Behavioral results: Grammatical judgments 


Figure 2A presents the mean rates of “grammatical” judgments for the two types of
subordinate clauses, two word orders, and two subordinate subjects. Figure 2B presents a
decision tree with these factors as independent variables. The main effect of word order was
significant, and sentences were judged grammatical significantly more often for Comp than
for Adjunct in both canonical and scrambled orders. The subordinate subjects exhibited no
significant effect.


2.2.2 Event-related potential analysis 


We analyzed the ERPs at the third and the fourth phrases with a prestimulus baseline
of 100 ms. The comparisons between the conditions were corrected by cluster-based
permutation tests. The analyses of the condition effects in the ERPs were performed using
the STUDY command structure in EEGLAB. Nonparametric random permutation
statistics were computed to test the significance of the condition effects. In this study,
we computed 2,000 random permutations and compared them to the t-values for the
mean condition differences.
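The idea behind the permutation statistics can be illustrated with a simplified sketch. The code below is a plain paired sign-flip permutation test on one amplitude value per participant; it omits the cluster-based correction and is not the EEGLAB STUDY implementation used in the study. The function name and the amplitude values are hypothetical.

```python
import numpy as np


def paired_permutation_test(cond_a, cond_b, n_permutations=2000, seed=0):
    """Two-sided sign-flip permutation test on paired differences.

    Returns the observed paired t-value and a permutation p-value.
    """
    diffs = np.asarray(cond_a, dtype=float) - np.asarray(cond_b, dtype=float)
    n = diffs.size
    observed = diffs.mean() / (diffs.std(ddof=1) / np.sqrt(n))

    # Under the null hypothesis, the sign of each participant's difference
    # is exchangeable, so signs are flipped at random.
    rng = np.random.default_rng(seed)
    signs = rng.choice([-1.0, 1.0], size=(n_permutations, n))
    permuted = signs * diffs
    null_t = permuted.mean(axis=1) / (permuted.std(axis=1, ddof=1) / np.sqrt(n))

    # Share of permutations at least as extreme as the observed t-value.
    p_value = (np.sum(np.abs(null_t) >= abs(observed)) + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical mean amplitudes (in microvolts) for 22 participants.
canonical = np.linspace(0.0, 2.1, 22)
scrambled = canonical - (1.0 + 0.05 * np.arange(22))
t_obs, p_value = paired_permutation_test(scrambled, canonical)
print(t_obs, p_value)
```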


Event-related potential at the third phrase 

The processing contrast in the third phrases between canonical and scrambled 
orders is shown in (15). No Japanese verb takes a topically marked human NP, a 
nominatively marked human NP, and an accusatively marked NP as its arguments. 
Therefore, the presence of a complex sentence is recognized here. 


(15) a. Canonical order
        P1      P2            P3
        NP-TOP  [name/I-NOM   NP-ACC ...
     b. Scrambled order
        P1      P2      P3
        NP-ACC  NP-TOP  [name/I-NOM ...


Further, in scrambled order the sentence-initial accusatively marked NP can establish 
a discontinuous dependency with the verb to come later given that the topic NP in the 
second phrase intervenes between the two. 

Figure 3 presents ERPs time-locked to the onsets of the third phrases for canonical
and scrambled orders. We observed a significant anterior negativity, although it
did not reach significance in the topography for the standard LAN/N400 time window
(300 to 500 ms). We also observed a significant parietal-occipital positivity in the
standard P600 time window (500 to 800 ms). No significant difference was observed at
the two EOG electrodes (VEOG and HEOG) in the -100 to 800 ms time window.

Figure 4 presents ERPs time-locked to the onsets of the third phrases for the two
different subordinate subjects for canonical and scrambled orders. The anterior
negativity reached a significant level in the topography of 300 to 450 ms only when the
subordinate subjects were proper names. The ERP waveforms at frontal electrodes indicated
that the anterior negativity lasted longer for the proper names than for the pronoun.

Table 2 presents the correlation between the maximum amplitude of the anterior
negativity and that of the parietal-occipital positivity in the time windows in which the
contrasts between the canonical and scrambled orders were significant in the
topography (250 to 350 ms for the negativity and 500 to 800 ms for the positivity). The
correlation between the negative ERP and the positive ERP was marginally significant (r =
0.42, p = .053). The mean amplitudes of the negativity were negative; thus, the
positive correlation coefficient indicates that a participant with a greater negative ERP
magnitude showed a smaller positive ERP magnitude.
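The sign logic of this correlation can be checked with a toy example. The amplitude values below are hypothetical and serve only to show that a positive correlation between signed amplitudes corresponds to a negative correlation between the magnitude of the negativity and the positivity.

```python
import numpy as np

# Hypothetical maximum amplitudes in microvolts (illustration only).
negativity = np.array([-3.0, -2.0, -1.0, -0.5])  # signed, all below zero
positivity = np.array([0.5, 1.0, 1.5, 2.5])

# Correlation computed on the signed amplitudes, as in Table 2.
r_signed = np.corrcoef(negativity, positivity)[0, 1]

# Correlation computed on the magnitude (absolute value) of the negativity.
r_magnitude = np.corrcoef(np.abs(negativity), positivity)[0, 1]

print(r_signed > 0, r_magnitude < 0)
```

Because every negativity value is below zero, taking the magnitude simply flips the sign of the coefficient.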
Table 2: Means of the maximum amplitudes of the anterior negativity and the parieto-occipital
positivity (SD), with r as their correlation coefficient.

                                          Negative ERP                 Positive ERP              r
Electrodes                                FP1, FP2, F7, F3, Fz, F4,    P7, P3, Pz, P4, P8,
                                          F8, FCz, Cz                  O1, Oz, O2
Time window (ms)                          250-350                      500-800
Mean of the maximum amplitudes (µV, SD)   -1.29 (1.17)                 1.72 (0.89)               0.42*

*p < 0.1


Event-related potential at the fourth phrase 

The processing contrast in the fourth phrases between Comp and Adjunct is
schematically shown in (16) and (17), respectively (discontinuous elements are
indicated by underlines). The thematic correspondence between the sentence-initial
NP and the subordinate verb and their syntactic relationship are established here.


(16) Verbal complement declarative
     a. Canonical order
        P1      P2       P3      P4
        NP-TOP  [NP-NOM  NP-ACC  verb]-complementizer ...
     b. Scrambled order
        P1      P2      P3       P4
        NP-ACC  NP-TOP  [NP-NOM  verb]-complementizer ...

(17) Adverbial adjunct clause
     a. Canonical order
        P1      P2       P3      P4
        NP-TOP  [NP-NOM  NP-ACC  verb]-because ...
     b. Scrambled order
        P1      P2      P3       P4
        NP-ACC  NP-TOP  [NP-NOM  verb]-because ...


Figures 5 and 6 respectively present ERPs time-locked to the onsets of the fourth 
phrases for the two types of subordinate clauses for canonical and scrambled 
orders. We observed a significant negative deflection for scrambled order relative 
to canonical order in the parietal-occipital region in the 400-500 ms time window 
for Comp and Adjunct. We also observed a positive deflection in the broad regions 
for scrambled relative to canonical order in Adjunct in the 600-800 ms time 
window. No significant difference was observed at the two EOG electrodes in the
-100 to 800 ms time window. The peak latency of the positivity at FP1 and Fz was
positively correlated with the individual grammatical judgment rate for the scrambled
Adjunct sentences.


3 Discussion 


We observed anterior negativity in the third phrase in scrambled order, and the
negativity was more salient when the subordinate subject in the third phrase was a
proper name than when it was the first-person singular pronoun. We can, thus,
understand the anterior negativity to be a manifestation of additional working memory
load for the discontinuous dependency because a proper name introduces a new
discourse referent whereas a first-person pronoun does not. A parietal-occipital
positivity followed the anterior negativity. A positivity at this processing point
has often been interpreted as a manifestation of syntactic integration at the “pregap
position” between the sentence-initial subordinate object and the subordinate verb to
come, with the assumption that a gap is placed at the base position of a subordinate
object from which the object is moved to the beginning of the sentence (Hagiwara
et al. 2007). Note that some researchers have claimed that biphasic negativity and
positivity can be functionally linked (Van De Meerendonk et al. 2008; Kim, Oines, &
Miyake 2018). In our experiment, the magnitude of the parietal-occipital positivity
was negatively correlated with that of the preceding negativity, which suggests that
the two ERPs are functionally linked. A late positivity (P600) can also be elicited by
reanalysis. The defendant in (18a) is temporarily ambiguous between the matrix object
and the subordinate subject, and it is preferably interpreted as the former. Thus,
we can assume that the defendant is reanalyzed as the subordinate subject at the
input of was. As one of the early relevant studies, Osterhout, Holcomb, and Swinney
(1994) visually presented (18) word by word with the SOA and the ISI set to 650 and
300 ms, respectively, and they observed a positive ERP for was in (18a) relative to
(18b) in the 500-800 ms time window in the parietal-occipital region.


(18) a. The lawyer charged the defendant was lying.
     b. The lawyer charged that the defendant was lying.


In our scrambled sentences, the sentence-initial accusatively marked NP is
processed as an element of a simple sentence by default; thus, it is reanalyzed at the
third phrase as a subordinate element. This reanalysis at the third phrase is affected by
the working memory resources available because the NP presented two phrases earlier
has to be retrieved for reanalysis. If the magnitude of the late positivity was a
manifestation of the working memory resources available for reanalysis, it would be smaller
when the subordinate subject was a costly proper noun. Thus, the parietal-occipital
positivity at the third phrase may be a manifestation of the reanalysis of the
sentence-initial NP as a subordinate element.

At the fourth phrases in scrambled order, we observed occipital negativity for
Comp and Adjunct in the 400-500 ms time window and a significant late positivity
only for Adjunct in the 600-800 ms time window. Establishing the verb-argument
correspondence between the sentence-initial object and the subordinate verb was
common to Comp and Adjunct, whereas the syntactic anomaly was peculiar to Adjunct
in scrambled order. We also found a significant positive correlation between the
peak latency of the positivity at several frontal electrodes and the individual
grammatical judgment rate for the scrambled Adjunct sentences. We can, thus,
understand the occipital negativity as a manifestation of the thematic correspondence
between the preposed object and the subordinate verb, and the late positivity as a
manifestation of the detection of a syntactic island violation. Figure 7 schematically
shows the time course of the processing of our experimental sentences in
scrambled order and the associated ERPs.


[Figure 7 schematic. Phrases: P1 NP-ACC, P2 NP-TOP, P3 [NP-NOM, P4
verb-complementizer/because. Processing contents: detection of biclausal dependency
and reanalysis at P3; verb-argument thematic correspondence and detection of
syntactic anomaly at P4. ERP sequence: biphasic negativity and positivity at P3;
occipital negativity and late positivity at P4.]

Figure 7: Event-related potential sequence in the processing of Japanese sentences with scrambled
order and their processing contents.


The first research question was to examine the possible difference in neural
activities between English and Japanese, as per their different head directions. We
succeeded in dividing the neural activities in a potential island construction in
Japanese into several different aspects of processing. We observed four ERPs that
could be confounded in the English counterparts. The second research question
was to examine the correspondence between expected ERPs and the processing
contents. As in Figure 7, the sequence of the four ERPs could be manifestations of
the different processes in a possible island construction in Japanese. The third
research question was to examine the possibility of attributing the relative
weakness of the island effect in Japanese to its typological properties in sentence
processing. In Japanese (potential) island constructions, the different aspects of the
construction are processed sequentially. Specifically, the syntactic relationship
between the discontinuous elements is computed after both elements have been received.
In English island constructions, the relevant processes are performed simultaneously at
the head while one of the discontinuous elements has not yet been received. The syntactic
computation of a dependency with one of the two elements unknown is more costly than
the computation with the two elements retained in working memory. The
difference in the processing time course can, thus, be one of the reasons for the
relatively weak island effect in Japanese.


4 Concluding remarks 


We have discussed the processing of two types of complex sentences in Japanese,
and we have succeeded in examining in detail different aspects of the processing
that can be confounded in their English counterparts. The experiment
suggests that a comparative study on Japanese, which is agglutinative and head-final, is
helpful in discussing the universality and peculiarity of sentence processing.


5 Limitations 


We have succeeded in clarifying the time series of the processing of discontinuous 
dependency in Japanese, paying close attention to the difference in head direction 
between English and Japanese. However, we have not found the reason a syntactic 
island in English is associated with the LAN in some cases and the P600 in others. 
Researchers have been discussing what the P600 means (Van De Meerendonk 
et al. 2008; Brouwer et al. 2016; Brouwer and Crocker 2017). We should examine 
the generators of the linguistic ERPs because an ERP can be an overlap of mul- 
tiple ERPs that are manifestations of neural activities in different brain regions. 
We should also discuss the connectivity between multiple brain regions because a 
mental function can be realized by the interaction in the network. 
Katsuo Tamaoka


Chapter 5 
The time course of SOV and OSV sentence 
processing in Japanese 


1 Introduction 


As native speakers acquire their language, the mental lexicon is developed and
stored in their brains. The process of retrieving the meaning of a certain item from
the mental lexicon is called lexical access. Psycholinguists often refer to the
syntactic operation system believed to exist in the brain as the parser. The time course
of sentence formation depends on syntactic complexity and semantic context.
Any transitive verb indicates what kind of subject phrase and object phrases are
required for constructing a sentence. For example, the verb “eat” in the sentence
“Tom ate an orange” provides information on the subject “Tom” as an actor and
the object “an orange” as a thing to be eaten. This type of information provided
by the verb is called argument information. A transitive sentence in the verb-final
Japanese language has two basic orders: SOV or OSV (S is the subject phrase, O the
object phrase, and V the verb). These arguments of noun phrases (NPs) are marked by one
of three markers: two case markers of nominative -ga (NP_NOM) and accusative -o
(NP_ACC) and one topic marker -wa (NP_TOP). These three markers construct four
variations of SOV and OSV orders, as in sample sentences (1) to (4). All these sentences
carry the same meaning of “(My) mother ate (an) apple.” In this sentence, “mother”
is understood as “my” mother unless otherwise specified. Moreover, there is no
distinction between plural and singular or definite and indefinite articles for “apple.”
These four formats seem to be processed differently even though they carry the
same or at least similar meanings. Thus, an examination of how these sentences are
differently processed and the underlying factors is warranted.


(1) SOV canonical order
    Haha ga ringo o tabe ta
    mother NOM apple ACC eat PST
    “(My) mother ate (an) apple.”

(2) OSV scrambled order
    Ringo o haha ga tabe ta
    apple ACC mother NOM eat PST

(3) Subject topicalized: the same order as SOV canonical
    Haha wa ringo o tabe ta
    mother TOP apple ACC eat PST

(4) Object topicalized: the same order as OSV scrambled
    Ringo wa haha ga tabe ta
    apple TOP mother NOM eat PST
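The way the three markers yield the four variations in (1) to (4) can be sketched in a few lines. The function name and dictionary keys are hypothetical labels introduced for illustration; the romanized words are those of the sample sentences.

```python
def build_variants(subject="haha", obj="ringo", verb="tabe ta"):
    """Build the four order/marker variants of a transitive sentence.

    -ga marks the nominative, -o the accusative, and -wa the topic;
    the verb stays in sentence-final position in every variant.
    """
    return {
        "SOV canonical": f"{subject} ga {obj} o {verb}",
        "OSV scrambled": f"{obj} o {subject} ga {verb}",
        "Subject topicalized": f"{subject} wa {obj} o {verb}",
        "Object topicalized": f"{obj} wa {subject} ga {verb}",
    }


variants = build_variants()
for label, sentence in variants.items():
    print(f"{label}: {sentence}")
```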


Sentence (1) “(My) mother ate (an) apple” is in the order of “(my) mother,” with the
nominative case marker -ga (NP_NOM), “(an) apple,” with the accusative case marker
-o (NP_ACC), and finally the past-tense verb (V-PST) “ate.” This sentence order is
canonical in a transitive sentence. In sentence (2) the positions of NP_NOM and NP_ACC are
scrambled, as is characteristic of the OSV order. Again, the verb “ate” appears
at the end of the sentence. A slowing in the speed of sentence processing for the
scrambled OSV order relative to the canonical SOV order is frequently observed
(e.g., Tamaoka et al. 2005; Tamaoka et al. 2014; Tamaoka and Mansbridge 2019).
Thus, the first question (Question 1) arises as to why an OSV scrambled sentence
requires additional processing time over a canonical SOV sentence. On this matter,
it is also true that the scrambled distance becomes even longer in a ditransitive
sentence (O_1 S O t_1 V; t is a trace and O_1 is the object originally placed in the t_1
position) relative to a transitive sentence (O_1 S t_1 V). A second question (Question 2)
is then posed as to whether the scrambled distance affects the processing speed. If so,
what is the factor responsible for it?

The verb appears at the end of all sentences from (1) to (4) and contains the 
argument information necessary to create a sentence structure. In Japanese, this 
information is only available at the end of the sentence. Thus, a native Japanese 
speaker cannot identify the cases of NPs required to construct a sentence until the 
final verb is seen. The third question (Question 3) of whether a verb-final language 
is disadvantageous for sentence processing subsequently emerges. Furthermore, 
if native Japanese speakers can process a sentence without the argument infor- 
mation provided by a verb, the question of the function of the final verb for Japa- 
nese sentence processing (Question 4) is warranted. This question can be further 
rephrased to ascertain whether there is any use for argument information in a 
verb-final language. 

Noun phrases in Japanese are topicalized by the topic marker -wa (NP_TOP). The
subject and object can be topicalized by the same topic marker. Sentence (3) is
an example of a subject-topicalized (NP_SUBJ-TOP) sentence. This sentence is in the
same order as the SOV canonical order. Subject-topicalized sentence (3) starts with
“Speaking of (my) mother.” The subject topicalization also implies an exclusionary
meaning: “mother,” not other family members. Sentence (4) is an object-topicalized
(NP_OBJ-TOP) sentence. The object topicalization is in the same order as the OSV
scrambled order. This sentence starts with “Speaking of (the) apple.” This object
topicalization also implies an exclusionary meaning: “(the) apple,” not other fruits.
Finally, the fifth question (Question 5) asks how topicalization affects sentence
processing. It can be restated in measurable terms as follows: Does a topicalized
sentence require longer or shorter processing time than the equivalent
non-topicalized sentence? This chapter discusses these five questions in depth.


2 (Question 1): Why does an OSV scrambled 
sentence require more processing time than 
a canonical SOV sentence? 


The processing speed and accuracy of Japanese sentences are often measured by
a sentence correctness decision task using experimental software (e.g., E-prime,
DMDX, PsychoPy). In this task, asterisks ******** indicating an eye-fixation point
are presented at the center of a computer screen. Soon after (a 600 ms interval
is often used), a stimulus sentence is presented. Semantically coherent sentences
(calling for a “Yes” response) and anomalous sentences (calling for a “No” response)
are presented to participants in random order. Participants are asked to
decide whether the sentences are semantically acceptable by pressing a “Yes” or
“No” button. They are also asked to answer as quickly as possible while maintaining
accuracy. The task measures the elapsed time between the presentation of a
sentence and the participant's subsequent response. This interval is called reaction
(or processing) time. Thus, reaction time includes accessing lexical items,
constructing a syntactic structure, understanding the meaning of the whole sentence,
and finally making the decision.

The canonical order of SOV was found to be processed faster than the scrambled 
order of OSV in various psycholinguistic studies (e.g., Imamura, Sato and Koizumi 
2016; Koizumi and Tamaoka 2004; Mazuka, Itoh, and Kondo 2002; Miyamoto 2006; 
Miyamoto and Takahashi 2004; Tamaoka et al. 2005; Tamaoka et al. 2014; Tamaoka 
and Mansbridge 2019; Ueno and Kluender 2003; Witzel and Witzel 2016). Tamaoka 
et al. (2005) used a sentence correctness decision task to measure the processing 
time for SOV and OSV sentences such as those presented in (1) and (2). The pro- 
cessing time of an SOV sentence is shorter than that of a scrambled OSV sentence 
(without any context). Japanese SOV sentences required 1,209 ms on average, while
Japanese OSV scrambled sentences required 1,432 ms on average to process. The 
processing time difference between SOV and OSV sentences was 223 ms. The same 
trend occurred in processing accuracy for SOV sentences (M = 96.98%, M is the 
mean) over scrambled OSV sentences (M = 90.93%). 
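
The arithmetic behind these figures can be checked with a short sketch. The Python snippet below is only an illustration: it recomputes the scrambling effect from the condition means cited above, and the helper name `scrambling_effect` is an assumption introduced here rather than a term from Tamaoka et al. (2005).

```python
# Condition means cited in the text for transitive sentences
# (Tamaoka et al. 2005).
RT_MS = {"SOV": 1209, "OSV": 1432}            # mean reaction times (ms)
ACCURACY_PCT = {"SOV": 96.98, "OSV": 90.93}   # mean accuracy (%)


def scrambling_effect(canonical: float, scrambled: float) -> float:
    """Return scrambled minus canonical (hypothetical helper name)."""
    return scrambled - canonical


rt_cost = scrambling_effect(RT_MS["SOV"], RT_MS["OSV"])
# Accuracy drops under scrambling, so the cost is canonical minus scrambled.
accuracy_cost = round(ACCURACY_PCT["SOV"] - ACCURACY_PCT["OSV"], 2)

print(f"Scrambling effect on speed: {rt_cost} ms")
print(f"Scrambling effect on accuracy: {accuracy_cost} percentage points")
```

The sign convention differs between the two measures on purpose: a positive value means slower responses for speed and fewer correct responses for accuracy.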

The scrambling effect describes the delay in processing time and more frequent
inaccuracy for scrambled OSV-ordered sentences over their SOV canonical
counterparts. The sentence processing model of gap-filling parsing (Frazier 1987;
Frazier and Clifton 1989; Frazier and Flores D’Arcais 1989; Frazier and Rayner 1982;
Stowe 1986) provides one possible explanation for the delay with the scrambled
OSV order. This scrambling can be explained as a syntactic operation of phrasal
movement from the original locus (t_1) of the object (NP_ACC-o_1) in the canonical
position to the sentence-initial position as in [CP NP_ACC-o_1 [IP NP_NOM-ga [VP t_1 V]]],
where IP is the inflectional phrase and CP the complementizer phrase (or simply
O_1 S t_1 V). The t_1 (gap) indicates the original position in the canonical order
from which the NP-o_1 was moved to the sentence-initial position.


[Diagram: filler-gap dependency linking the filler O_1 to the gap t_1 (a shorter-distance scrambling).]

Figure 1: The filler-gap dependency in a transitive sentence (O_1 S t_1 V).


As Figure 1 shows, to process a scrambled sentence, native Japanese
speakers must recognize the initial NP_ACC-o_1 as the filler and find its original
position in VP (gap) to establish the filler-gap dependency. Here, given the degree
of syntactic complexity, a canonical SOV-ordered sentence is expected to be
processed more quickly than its OSV-ordered scrambled counterpart (O_1 S t_1 V).


3 (Question 2): Does longer-distance
scrambling require longer processing
time than shorter-distance scrambling?


As Figure 1 shows, the scrambled distance in a transitive sentence (O_1 S t_1 V)
comprises only the subject NP_NOM (S) between the filler (O_1) and the gap (t_1). It
is called shorter-distance scrambling. For longer-distance scrambling (Tamaoka et
al. 2005), a ditransitive sentence is used to measure the effect of scrambled
distance. When the object (NP-o_1, hon-o “(a) book”) is moved from its original locus
in the canonical order of sentence (5) to the sentence-initial position, as in
sentence (6), the scrambled distance comprises the two NPs of NP_NOM (S) and NP_DAT (O)
between the filler (O_1) and the gap (t_1). This scrambling is denoted as O_1 S O t_1 V.


(5)  SOOV canonical order
Hanako ga Taro ni hon o kaesi ta
Hanako NOM Taro DAT book ACC return PST
“Hanako returned (a) book to Taro.”


(6)  O_1 S O t_1 V scrambled order
Hon o Hanako ga Taro ni kaesi ta
book ACC Hanako NOM Taro DAT return PST


In Figure 2, gap_1 in O_1 S O t_1 V (t_1 is equivalent to gap_1) marks the original
position in the canonical SOOV order from which the NP_ACC-o_1 was moved to the
sentence-initial position. Native Japanese speakers must recognize the initial
NP_ACC-o_1 as the filler and find its original position in gap_1 to establish the
filler-gap dependency and process the scrambled sentence. Relative to the transitive
sentence in Figure 1, the scrambling in a ditransitive sentence in Figure 2 can be
considered a longer-distance scrambling. The effect of the scrambled distance on
processing can be probed by comparing the difference in the scrambling effect
between transitive and ditransitive sentences.


[Diagram: filler-gap dependency linking the filler O_1 to the gap t_1 (a longer-distance scrambling).]

Figure 2: The filler-gap dependency in a ditransitive sentence (O_1 S O t_1 V).


Figure 3 shows the processing speed results for shorter- and longer-distance
scrambling in (di)transitive sentences. As noted, Tamaoka et al. (2005) report that
the difference in the processing time of a transitive sentence between the SOV and
O_1 S t_1 V orders (i.e., the shorter-distance scrambling effect) was 223 ms. They
show that canonical SOOV ditransitive sentences required 1,359 ms on average, while
the corresponding O_1 S O t_1 V scrambled sentences required 1,963 ms on average. The
difference in processing time (i.e., the longer-distance scrambling effect) was 604
ms. Thus, the difference in the scrambling effect between shorter- and
longer-distance scrambling was 381 ms (604 ms - 223 ms). The magnitude of the
scrambling effect was large even though the scrambled distance differed by only a
single NP between transitive and ditransitive sentences.

Further, the percentage difference in processing accuracy between an SOV
and an O_1 S t_1 V transitive sentence was 6.05%, while the difference in accuracy
between an SOOV and an O_1 S O t_1 V ditransitive sentence was 10.00%. The
difference in the scrambling effect on accuracy between transitive and ditransitive
sentences was 3.95% (10.00% - 6.05%). Relative to shorter-distance scrambling,
longer-distance scrambling was more challenging to process accurately. Thus, the
scrambled distance caused a larger delay in reaction times and a higher rate of
errors, even with a single NP difference.
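
The distance effect described above is a difference between two differences, which a brief sketch can make explicit. The Python snippet below is offered only as an illustration using the means cited in the text; the names `MEANS_MS` and `scrambling_effects` are assumptions introduced here.

```python
# Mean reaction times (ms) cited in the text (Tamaoka et al. 2005).
MEANS_MS = {
    "transitive": {"canonical": 1209, "scrambled": 1432},    # SOV vs. OSV
    "ditransitive": {"canonical": 1359, "scrambled": 1963},  # SOOV vs. OSOV
}


def scrambling_effects(means: dict) -> dict:
    """Scrambled minus canonical for each sentence type (hypothetical helper)."""
    return {
        sentence_type: rt["scrambled"] - rt["canonical"]
        for sentence_type, rt in means.items()
    }


effects = scrambling_effects(MEANS_MS)
# The distance effect is the extra cost of longer-distance scrambling.
distance_effect = effects["ditransitive"] - effects["transitive"]

print(f"Shorter-distance scrambling effect: {effects['transitive']} ms")
print(f"Longer-distance scrambling effect: {effects['ditransitive']} ms")
print(f"Distance effect: {distance_effect} ms")
```

Computing the effect within each sentence type first, and only then comparing the two effects, keeps the baseline difference between transitive and ditransitive sentences out of the distance estimate.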


[Bar chart of reaction times (ms) for canonical and scrambled orders. Shorter-distance scrambling: SOV order (1,209 ms) vs. O_1 S t_1 V order (Δ223 ms). Longer-distance scrambling: SOOV order vs. O_1 S O t_1 V order (1,963 ms; Δ604 ms).]

Figure 3: The scrambling effect of shorter- and longer-distance scrambling.
Note: *** p < .001. Δ is the scrambling effect and ± is the standard deviation.


In Tamaoka et al. (2005), canonical SOV or SOOV sentences were re-arranged into 
scrambled OSV or OSOV orders, respectively, such that each pair of canonical and 
scrambled sentences carried the same meaning. Moreover, SOV-OSV and SOOV- 
OSOV conditions were presented under the same experimental condition, where 
no previous contextual information was given to native Japanese speakers for the 
sentence correctness decision task. One weakness may be that the comparison
of the scrambling effect between the shorter (SOV-OSV) and longer
(SOOV-OSOV) scrambling distances was based on sentences that were not semantically
identical. The difference between the shorter and longer scrambling distances was,
however, large at 381 ms in speed and 3.95% in accuracy. As Tamaoka et al. (2005)
used the simplest syntactic structures of sentences to probe the scrambling effect, 
the distance effect in scrambling seems to exist as an inhibitory effect in sentence 
processing. This effect may stem from the distance difference in the filler-gap 
dependency for performing the gap-filling parsing, as in Figures 1 and 2 for shorter 
and longer-distance scrambling, respectively. 


4 (Question 3): Is a head (verb) final language 
disadvantageous for sentence formation? 


Head-driven parsing (Pritchett 1988, 1991, 1992) suggests that syntactic phrasal 
structures are established by the head verb that provides necessary argument 
information for the construction of a sentence (Ikuta et al. 2009; Wolff et al. 2008). 
According to this processing model, understanding the information provided by the 
verb is the key to sentence formation. The head-driven parsing model applies best 
to English and other European languages. However, when the model is extended to 
other languages, it raises an additional question: are head- (verb)-final languages, 
such as Japanese and Korean, disadvantageous for sentence formation, unlike 
head- (verb)-initial languages, such as Kaqchikel and Tongan? 

Regarding the head-final language of Japanese, the transitive verb “eat” provides
information for the two arguments of the agent “(my) mother” with a nominative
case marker (NP_NOM-ga) and the theme “(an) apple” with an accusative case
marker (NP_ACC-o) in sentence (1). The two NPs are linked by the verb “eat.” However,
as the verb is at the end of the sentence, its argument information cannot be utilized
by native Japanese speakers until the sentence ends. One could follow the two NPs
with other verbs, such as “buy,” “wash,” and “cook.” According to the head-driven
parsing model, the verb-final position required for a head-final language causes
confusion among native speakers, which may explain the delay in sentence processing.
However, in such a situation, native Japanese speakers can combine the two NPs to get
a head start on processing the whole sentence before the final verb “eat” becomes
available.

By contrast, when the head-driven parsing model is applied to verb-initial lan- 
guages, such as many Austronesian (e.g., Tagalog, Hawaiian, and Tongan) and Mayan 
(e.g., Kaqchikel, Tz’utujil, and Achi) languages, a great advantage is expected in pro- 
cessing sentences. Native speakers of these languages obtain the argument information 
from the verb at the beginning of a sentence and can easily process a whole sentence 
based on argument information. Thus, the model predicts a distinct difference in
reaction time between a verb-initial language and a verb-final language: a sentence
of a verb-initial language should be processed much faster than an equivalent
sentence of a verb-final language.

Koizumi et al. (2014) document reaction times for transitive sentences in
canonical VOS and scrambled VSO orders in the verb-initial Kaqchikel language. 
Sentences with two NPs and a single verb in Kaqchikel transitive sentences may 
be considered equivalent in constituent elements to a Japanese transitive sentence. 
The mean reaction times for canonical and scrambled orders in both languages are
comparable in cognitive load for sentence processing. As in Figure 4, processing of
canonical VOS sentences in Kaqchikel took 3,403 ms on average (89.39% accuracy),
while scrambled VSO sentences took 3,601 ms on average (77.10% accuracy). The
scrambling effect measured 198 ms (3,601 ms for VSO - 3,403 ms for VOS). The
magnitude of the scrambling effect in Kaqchikel was similar to that of Japanese at
223 ms.


[Bar chart of reaction times (ms) for canonical and scrambled orders. Japanese transitive sentences: SOV order (1,209 ms) vs. O_1 S t_1 V order (1,432 ms), Δ223 ms. Kaqchikel transitive sentences: VOS order (3,403 ms) vs. V t_1 S O_1 order (3,601 ms), Δ198 ms. Cross-language differences: canonical 3,403 - 1,209 = Δ2,194 ms; scrambled 3,601 - 1,432 = Δ2,169 ms.]

Figure 4: The scrambling effect in transitive sentences in Japanese (visual presentation)
and Kaqchikel (auditory presentation).
Note: ** p < .01. *** p < .001. ± is the standard deviation.
Δ is the difference in the scrambling effect.


Since Kaqchikel sentences were auditorily presented, reaction times include the
time needed to play a whole sentence. Thus, longer decision times for
sentences were expected. The difference in the average reaction times for processing
whole transitive sentences in the canonical (or basic) word order was large
at 2,194 ms (3,403 ms for Kaqchikel VOS - 1,209 ms for Japanese SOV). Moreover, the
difference for transitive sentences in scrambled order was also large at 2,169 ms
(3,601 ms for Kaqchikel VSO - 1,432 ms for Japanese OSV). This difference in reaction
times between Japanese and Kaqchikel stemmed from the different presentation
methods used for the experiments: visual presentation in Japanese and auditory
presentation in Kaqchikel. Considering that the overall average duration of an
auditorily presented Kaqchikel sentence is approximately 2,100 ms, native Japanese
and Kaqchikel speakers are likely to process sentences in nearly equivalent times.
Likewise, as in Figure 4, the magnitude of the scrambling effect was similar, at
223 ms for Japanese and 198 ms for Kaqchikel. Thus, the result does not support a
processing-time advantage for the verb-initial language Kaqchikel over the
verb-final language Japanese. Japanese and Kaqchikel seem to have similar processing
times for canonical and scrambled sentences.
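
The reasoning in this paragraph can be traced with a rough calculation. The Python sketch below subtracts the approximate 2,100 ms audio duration from the Kaqchikel means before comparing them with the Japanese means. This simple subtraction is an assumption made here for illustration, not the analysis reported by Koizumi et al. (2014), and the helper name `adjusted_gap` is hypothetical.

```python
# Mean reaction times (ms) cited in the text.
JAPANESE_MS = {"canonical": 1209, "scrambled": 1432}   # SOV, OSV (visual)
KAQCHIKEL_MS = {"canonical": 3403, "scrambled": 3601}  # VOS, VSO (auditory)

# Approximate mean duration of a spoken Kaqchikel sentence, as stated in the text.
AUDIO_DURATION_MS = 2100


def adjusted_gap(order: str) -> int:
    """Kaqchikel minus Japanese reaction time after removing audio duration.

    Hypothetical helper: a rough adjustment for illustration only.
    """
    return (KAQCHIKEL_MS[order] - AUDIO_DURATION_MS) - JAPANESE_MS[order]


for order in ("canonical", "scrambled"):
    raw_gap = KAQCHIKEL_MS[order] - JAPANESE_MS[order]
    print(f"{order}: raw gap {raw_gap} ms, adjusted gap {adjusted_gap(order)} ms")

# The scrambling effect within each language is unaffected by the adjustment.
kaqchikel_effect = KAQCHIKEL_MS["scrambled"] - KAQCHIKEL_MS["canonical"]
japanese_effect = JAPANESE_MS["scrambled"] - JAPANESE_MS["canonical"]
print(f"Scrambling effect: Kaqchikel {kaqchikel_effect} ms, Japanese {japanese_effect} ms")
```

Under this rough adjustment the cross-language gap shrinks from more than two seconds to under 100 ms, which is the sense in which the two groups process sentences in nearly equivalent times.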

Given that Kaqchikel is a unique language in that the object precedes the subject
in its VOS basic word order, examining another verb-initial language, such as the
Austronesian language Tongan, may add more data useful for analyzing the scrambling
effect. In Tongan, VSO is the canonical order of transitive sentences, while VOS is
also grammatically possible as a scrambled order (Churchward 1953; Custis 2004;
Dixon 1979, 1994; Otsuka 2000, 2005a, 2005b). In a sentence correctness decision
task, native speakers of the verb-initial Tongan language processed simple
transitive sentences, such as “The woman ate the fish.” The result
showed an average speed of 1,643 ms for the VSO canonical order and 1,753 ms for
the VOS scrambled order (data yet to be published). The difference in speed was 110
ms (significant at the 0.001 level). Although the magnitude of the scrambling effect
was much smaller than that for Japanese and Kaqchikel, the scrambling effect was
still apparent in Tongan.

The data from Kaqchikel and Tongan provide no evidence that sentences in
verb-initial languages are processed with an advantage over those in verb-final
languages. Therefore, the verb-initial order cannot be considered advantageous in
sentence processing. The head-driven parsing model may be a good fit for English and
other European languages.
However, the head-final language of Japanese seems to rely on features other than 
verb information for advantageous sentence processing. Notably, Tamaoka and 
Mansbridge (2019) show that the argument information provided by the verb is 
an important factor for processing a sentence in Japanese. This factor will be intro- 
duced in the following section. 


5 (Question 4): What function does a finally
positioned verb have for sentence processing?


If the verb-initial position is not advantageous for sentence processing, how is a
sentence in the verb-final language of Japanese processed? Kamide and Mitchell
(1999) and Kamide, Altmann, and Haywood (2003) provided evidence for pre-head
processing using the “visual-world” eye-tracking paradigm. In this paradigm,
multiple pictorial items are presented on a single screen, some of which relate to
a sentence that is auditorily presented. Participants look at this screen for
approximately one second. A sentence is then auditorily presented and the sequence
and duration of eye fixations are recorded by the eye-tracker. Kamide and Mitchell
(1999) and Kamide, Altmann, and Haywood (2003) find that participants were likely to
look at pictorial items on the screen that had not yet been mentioned in the
auditory input. Accordingly, Kamide and Mitchell (1999) suggested that advance
planning for comprehending sentences occurs incrementally before the final verb is
seen. Native Japanese speakers can anticipate the formation of a sentence based on
the argument information provided by the NPs' case markers.

Figure 5 shows the processing sequence of the Japanese SOV sentence (1).
Native Japanese speakers see or hear the agent haha-ga “(my) mother” first. They
can identify the subject phrase by referring to the nominative case marker -ga
(NP_NOM-ga). At this stage, native Japanese speakers already know that “(my) mother”
is the actor. Next, the theme ringo-o “(an) apple” with the accusative case marker
(NP_ACC-o) follows. By relying on these case markers, native Japanese speakers can
begin to form a sentence containing two NPs, as in
[IP NP(mother)_NOM-ga [VP NP(apple)_ACC-o ...]]. Then, they simply wait for the
ending verb tabe-ta “ate” to complete the sentence. Thus, for pre-head anticipatory
processing (Kamide and Mitchell 1999; Kamide, Altmann, and Haywood 2003; also see
Kamide 2008; Altmann and Kamide 1999 for general discussion), the first part of a
Japanese sentence is formed before seeing the ending verb.


[Tree diagram: the partial structure [IP NP_NOM-ga [VP NP_ACC-o ...]] is built from NP_NOM-ga haha-ga ‘(my) mother’ and NP_ACC-o ringo-o ‘(an) apple’ before the verb tabe-ta ‘ate’ arrives.]

Figure 5: Pre-head anticipatory sentence processing based on case markers.


If anticipatory processing takes place as Kamide et al. (1999, 2003) describe, one 
may wonder whether a Japanese verb has any role in sentence processing. An 
eye-tracking study by Tamaoka and Mansbridge (2019) shows that even a simple 
transitive sentence with a shorter-distance scrambling, as in Figure 1, involves pre- 
head anticipatory processing (Kamide 2008; Kamide and Mitchell 1999; Kamide, 


Chapter 5 The time course of SOV and OSV sentence processing in Japanese — 87 


Altmann, and Haywood 2003) and head-driven processing (Pritchett 1988, 1991,
1992). Tamaoka and Mansbridge (2019) employ sets of sentences such as (7) and (8) to
investigate the mechanism of processing. Note that sentence (8) is not the exact
scrambled equivalent of canonical sentence (7). For measuring eye-fixation
times in each region, the words of paired canonical and scrambled sentences
should ideally be as similar as possible. For instance, the only difference in the
NP in Region 1 was the case marker on Taro: -ga in sentence (7) and -o in sentence
(8). Similarly, Region 2 was controlled such that the only difference was the case
marker on Hanako: -o in sentence (7) and -ga in sentence (8). The ending verb
syootaisi-ta “invited” was retained. Eye-fixation durations and regression-in and
-out frequencies in each region were recorded using the EyeLink 1000 Core System
(SR Research Ltd., Ontario, Canada) for whole sentence reading by native Japanese
speakers.


(7)  SOV canonical order
Region 1 Region 2 Region 3
Taro ga Hanako o syootaisi ta
Taro NOM Hanako ACC invite PST
“Taro invited Hanako.”


(8)  O_1 S t_1 V scrambled order
Region 1 Region 2 Region 3
Taro o Hanako ga syootaisi ta
Taro ACC Hanako NOM invite PST


The results of processing transitive sentences in Figure 6 are recorded in
milliseconds, with Δ indicating differences in fixation times between O_1 S t_1 V and
SOV transitive sentences. Processing times for canonical ordered sentences were
subtracted from those for scrambled ordered sentences. The involvement of pre-head
anticipatory processing (e.g., Aoshima, Phillips, and Weinberg 2004; Aoshima,
Yoshida, and Phillips 2009; Kamide 2008; Kamide and Mitchell 1999; Kamide, Altmann,
and Haywood 2003; Miyamoto 2006; Mazuka, Itoh, and Kondo 2002; Witzel and Witzel 2016)
was indicated by a significantly longer “go-past time” of Δ129 ms in Region 2, before
the verb appears (see details of eye-tracking measurements in Tamaoka and Mansbridge
2019). The ending verb in Region 3 also received a significantly longer go-past time
of Δ140 ms.

Additionally, evidence of heavy head-driven processing (Ikuta et al. 2009; Wolff
et al. 2008) was seen in the re-reading time of Δ147 ms in Region 2. Given that
gap_1 (or t_1) in O_1 S t_1 V scrambled sentences is found between NP_NOM-ga (S) in
Region 2 and the head verb (V) in Region 3, the significantly longer re-reading time
suggested that native Japanese speakers read back to the crucial NP_NOM-ga in Region 2


88 —— Katsuo Tamaoka 


to check the argument structure of NP_NOM-ga and NP_ACC-o after seeing the head verb.
This trend was further supported by the occurrence of a significantly higher
regression-in frequency of Δ13% for O_1 S t_1 V scrambled sentences in Region 2 from
the ending verb.
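
For readers unfamiliar with these measures, the sketch below shows one common way of deriving go-past time and regression-in counts from a sequence of fixations. It is written in Python as an illustration only: the definitions follow a widespread convention in reading research and may differ in detail from those used by Tamaoka and Mansbridge (2019). The fixation data are invented for the example, and the function names are hypothetical.

```python
from typing import List, Tuple

Fixation = Tuple[int, int]  # (region index, fixation duration in ms)


def go_past_time(fixations: List[Fixation], region: int) -> int:
    """Sum durations from first entering `region` until the eyes first move
    to a later region; regressions to earlier regions are included."""
    total = 0
    entered = False
    for fix_region, duration in fixations:
        if not entered:
            if fix_region == region:
                entered = True
                total += duration
            elif fix_region > region:
                return 0  # simplification: region skipped on first pass
            continue
        if fix_region > region:
            break
        total += duration
    return total


def regressions_in(fixations: List[Fixation], region: int, source: int) -> int:
    """Count eye movements going directly from `source` back into `region`."""
    return sum(
        1
        for (prev, _), (cur, _) in zip(fixations, fixations[1:])
        if prev == source and cur == region
    )


# Invented toy trial: the reader reaches the verb (Region 3),
# looks back at Region 2, then returns to the verb.
trial = [(1, 200), (2, 250), (3, 300), (2, 180), (3, 220)]

print(go_past_time(trial, region=2))              # 250
print(go_past_time(trial, region=3))              # 700
print(regressions_in(trial, region=2, source=3))  # 1
```

In this toy trial the go-past time for the verb region includes the look back at Region 2, which is why go-past time is sensitive to the kind of backward checking described above.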


[Diagram across Regions 1 to 3. Go-past time: Region 2 Δ129 ms **, Region 3 Δ140 ms **. Re-reading time: Region 2 Δ147 ms ***. Regression-in from the verb into Region 2: Δ13%.]

Figure 6: Processing of scrambled transitive sentences observed by eye-tracking.
Note: ** p < .01. *** p < .001. Δ (ms) is a difference in fixation time (O_1 S t_1 V - SOV).


In summary, a scrambled NP_ACC-o and NP_NOM-ga order triggers pre-head anticipatory
processing to form a partial sentence structure. However, commonly used first names
as the two NPs do not provide sufficient information to establish the
filler-gap dependency. The O_1 S t_1 V sentence was most often read up to the head
verb and then read back to the crucial NP_NOM-ga. Evidence of reading backward
from the ending verb to Region 2 and re-reading times in Region 2, as observed by
Tamaoka and Mansbridge (2019), suggests that native Japanese speakers require
the argument information provided by the verb to resolve the filler-gap dependency
of scrambling even in simple transitive sentences.

Processing scrambled sentences was further probed in complex sentences comprising
three NPs and two verbs in the [S [SOV] V] format. Tamaoka and Mansbridge
(2019) embedded stimulus sets such as sentences (7) and (8) into complex
sentences, as in sentences (9) and (10). Again, sentence (10) is not based
exactly on sentence (9). However, a direct comparison of eye fixations and
regressions in and out in an eye-tracking study requires only that the same noun is
used in each Region. In sentence (10), the filler O_1 or NP_ACC-o_1 was moved from
the original locus in the embedded sentence (t_1) to the sentence-initial position
of Region 1, as in [O_1 S [S t_1 V] V]. Notably, Tamaoka and Mansbridge (2019) also
used short-distance scrambling in complex sentences. However, an analysis of that
condition is omitted here to simplify the discussion.




(9)  [S [SOV] V] canonical order
Region 1 Region 2 Region 3 Region 4
Kenji ga Taro ga Hanako o syootaisi ta
Kenji NOM Taro NOM Hanako ACC invite PST
Region 5
to kii ta
COMP hear PST
“Kenji heard that Taro invited Hanako.”


(10) [O_1 S [S t_1 V] V] scrambled order
Region 1 Region 2 Region 3 Region 4
Kenji o Taro ga Hanako ga syootaisi ta
Kenji ACC Taro NOM Hanako NOM invite PST
Region 5
to kii ta
COMP hear PST


The results of complex sentence processing, measured by eye tracking, as in
Figure 7, are recorded in milliseconds, with Δ indicating differences in fixation
times between canonical sentences [S [S O V] V] and scrambled sentences
[O_1 S [S t_1 V] V]. Processing times for canonical ordered sentences were
subtracted from those for scrambled ordered sentences. The initial NP_ACC-o in
Region 1 showed no significant difference in go-past time. The second NP, NP_NOM-ga
in Region 2, had a significantly longer go-past time (Δ49 ms). As Kamide et al.
(1999, 2003) suggest, the OS order or the o-and-ga order in Regions 1 and 2 can
provide the initial basis for a partially constructed phrasal structure
[IP NP_NOM-ga [VP NP_ACC-o ...]]. At the early processing stage (i.e., go-past time),
there was no extra processing in Region 3 for a scrambled sentence. Commonly used
first names as the three NPs do not provide sufficient information to establish
the filler-gap dependency. Thus, native Japanese speakers read ahead until they
reach the subsequent two verbs: the verb in the subordinate clause in Region 4
(Δ739 ms) and that in the main clause in Region 5 (Δ1,147 ms). Eye fixation on these
two verbs lasted much longer than on the verbs of the corresponding canonical
complex sentences, as in Figure 7. At the later stage of processing, re-reading time
and regression-in were highly significant in multiple Regions of complex sentences.
These significant re-reading times were seen in all three NPs: NP_ACC-o in Region 1
(Δ292 ms), NP_NOM-ga in Region 2 (Δ256 ms), and NP_NOM-ga in Region 3 (Δ369 ms). After
obtaining argument information from the verbs in the subordinate and main clauses,
native Japanese speakers read back to all three NPs to properly form the noun phrase
structure. Given that significant re-reading times were observed in Region 4
(Δ171 ms), this pattern is especially salient in cases where the verb is seen in the
subordinate clause. This trend was also seen in the significant frequencies of
regression-in into NP_ACC-o in Region 1 (Δ24%) and NP_NOM-ga in Region 3 (Δ19%).


[Diagram across Regions 1 to 5 showing differences in go-past time (Regions 2, 4, and 5), re-reading time (Regions 1 to 4), and regression-in frequency (Δ24% *** and Δ19% ***) for scrambled relative to canonical complex sentences.]

Figure 7: Processing of a scrambled complex sentence measured by eye-tracking.
Note: * p < .05. *** p < .001. Δ (ms) is the difference in fixation times while Δ (%) is the difference in
frequencies of regression-in. The differences were calculated by subtracting the processing time for
the canonical order [S [SOV] V] from that of the scrambled order [O_1 S [S t_1 V] V] in each region.


As Kamide et al. (1999, 2003) have proposed, pre-head anticipatory processing is
observed in head-final languages, such as Japanese. Furthermore, an eye-tracking
study by Tamaoka and Mansbridge (2019) showed longer go-past times for verbs
and re-reading times in all three NPs. Regression-in to NP_NOM-ga was found in simple
transitive and complex sentences. Native Japanese speakers must establish a
relationship between the filler NP_ACC-o_1 and gap_1 after obtaining the argument
information given by the verb to perform gap-filling parsing. Thus, depending on the
availability of processing cues, native Japanese speakers must perform pre-head
and head-driven (post-head) processing for scrambled sentences.


6 (Question 5): How does the nature of 
topicalization affect sentence processing? 


Topicalization in Japanese is produced by the topic marker -wa. The subject and
the object can be topicalized, as shown by the subject topicalization (NP_SUBJ-TOP)
in sentence (3) and the object topicalization (NP_OBJ-TOP) in sentence (4). A
topicalized phrase is usually positioned at the beginning of a sentence. When a
topicalized NP is placed in the second or even later position, the sentence becomes
less acceptable. This phenomenon warrants further investigation via a simple
questionnaire survey of naturalness judgments.

Shibatani (1990) proposed that the NP_SUBJ-TOP of transitive sentences
(S_TOP O_ACC V) is external to the IP, as in [CP NP_SUBJ-TOP_1 [IP t_1 [VP NP_ACC V]]].
In this structure, NP_SUBJ-TOP belongs to a CP, placed structurally higher than the
IP. If true, S_TOP O_ACC V will be more complex in syntactic structure than
S_NOM O_ACC V. The difference in structural complexity predicts that S_TOP O_ACC V
sentence processing will take longer than S_NOM O_ACC V sentence processing.
Moreover, since S_TOP O_ACC V involves only a topicalization movement, which does not
move beyond any NP, the scrambled O_ACC S_NOM V order is anticipated to take longer
to process than the S_TOP O_ACC V order. Moreover, the order of an object-topicalized
(NP_OBJ-TOP) transitive sentence is O_TOP S_NOM V, which is the same order as the
scrambled order of O_ACC S_NOM V. Kuroda (1987) further proposed that NP_OBJ-TOP
involves a topicalization movement and a scrambling movement. Because NP_OBJ-TOP
involves movements of both topicalization and scrambling, the sentence structure
becomes even more complex than the scrambled O_ACC S_NOM V. This difference in
structural complexity yields a prediction that O_TOP S_NOM V will require longer
processing time than will O_ACC S_NOM V. Based on this discussion of syntactic
structure (Kuroda 1987; Shibatani 1990), Imamura, Sato and Koizumi (2016)
hypothesized the following order in sentence processing speed.


S_NOM O_ACC V  =  S_TOP O_ACC V   <  O_ACC S_NOM V  <  O_TOP S_NOM V
Canonical         Topicalization     Scrambled         Topicalization and Scrambled


Figure 8: Assumed order of sentence processing speed based on syntactic complexity. 
Note: Reproduced from Imamura, Sato and Koizumi 2016, p. 5. 


Imamura, Sato and Koizumi (2016, Experiment 1) created the four types of sentences
exemplified in (11) to (14) to test the hypothesized order of sentence processing
in Figure 8. The stimulus sentences used common family names such as Sato, Suzuki,
Iida, and Hirota. The family names were used interchangeably as the subject
(e.g., Suzuki-ga) and the object (e.g., Suzuki-o). In this situation, native Japanese
speakers could not use semantic cues of animacy contrast, such as that between
“(my) mother” and “(an) apple,” to construct a partial sentence structure before
seeing the sentence-final verb.


(11) S_NOM O_ACC V: Canonical order
Satoo ga  Suzuki o   home   ta
Sato  NOM Suzuki ACC praise PST
"Sato praised Suzuki."

(12) S_TOP O_ACC V: Subject topicalized order
Satoo wa  Suzuki o   home   ta
Sato  TOP Suzuki ACC praise PST

(13) O_ACC S_NOM V: Scrambled order
Suzuki o   Satoo ga  home   ta
Suzuki ACC Sato  NOM praise PST

(14) O_TOP S_NOM V: Object topicalized order
Suzuki wa  Satoo ga  home   ta
Suzuki TOP Sato  NOM praise PST


[Figure 9: bar chart of mean reaction times (ms) for S_NOM O_ACC V and S_TOP O_ACC V
(canonical order) and for O_ACC S_NOM V and O_TOP S_NOM V (scrambled order), with bars
for -Topicalization and +Topicalization. Legible bar labels: 1,410; 1,414; 1,512 ±425;
further ± values: 364, 416, 450. Multiple comparison:
S_NOM O_ACC V = S_TOP O_ACC V < O_ACC S_NOM V < O_TOP S_NOM V.]


Figure 9: Processing topicalized sentences with canonical and scrambled orders. 
Note: ± is the standard deviation. Taken from Imamura, Sato and Koizumi 2016, p. 6.


Using a sentence correctness decision task, Imamura, Sato and Koizumi (2016)
presented whole sentences to participants, who were asked to decide whether the
sentences were syntactically and semantically correct. Given that the four types
of sentences presented in this part of the study contained the same constituent
nouns and verbs, reaction times were directly comparable. Figure 9 shows the mean
sentence processing times in Imamura, Sato and Koizumi (2016). There was a
difference of only 4 ms in processing time between the canonical S_NOM O_ACC V
(M = 1,410 ms; M refers to the mean reaction time) and the subject topicalized
S_TOP O_ACC V (M = 1,414 ms) sentences. Given this lack of difference in processing
time between the canonical S_NOM O_ACC V and the subject topicalized S_TOP O_ACC V
sentences, the syntactic structure with movement, as proposed by Shibatani (1990) and
Kuroda (1987), may not represent the "psychological reality" of sentence processing.
This result implies that subject topicalization may occur without any movement.
Imamura, Sato and Koizumi (2016) explain that both S_NOM O_ACC V and S_TOP O_ACC V
would be processed as the canonical SOV order.

The overall scrambling effect, that is, the difference between S_NOM O_ACC V plus
S_TOP O_ACC V (the mean of both conditions was M = 1,412 ms) and O_ACC S_NOM V plus
O_TOP S_NOM V (the mean of both conditions was M = 1,569 ms), was significant at
157 ms. Thus, the scrambling effect accords with previous studies (e.g., Koizumi and
Tamaoka 2004, 2010; Mazuka, Itoh and Kondo 2002; Miyamoto and Takahashi 2004;
Tamaoka et al. 2005; Tamaoka et al. 2014; Ueno and Kluender 2003).

The object topicalized sentences O_TOP S_NOM V (M = 1,626 ms) took significantly
longer to process than did the scrambled sentences O_ACC S_NOM V (M = 1,512 ms). The
increasing order of processing time, S_NOM O_ACC V < O_ACC S_NOM V < O_TOP S_NOM V,
may be accounted for by syntactic complexity. In any case, Imamura, Sato and Koizumi
(2016) discount Shibatani's (1990) topicalization movement proposal for S_TOP O_ACC V.
If topicalization does not involve any movement, O_TOP S_NOM V would involve only a
single scrambling movement; that is, O_TOP S_NOM V would have the same structure as
O_ACC S_NOM V. In that case, no difference in processing time between O_ACC S_NOM V
and O_TOP S_NOM V should have been found in their experiment. Nevertheless, the
results indicated that O_TOP S_NOM V took longer to process than O_ACC S_NOM V.
Hence, this result supports the idea of a double movement, scrambling plus
topicalization, proposed by Kuroda (1987).
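
The differences discussed in the two preceding paragraphs can be recomputed from the four condition means reported in the text. The following is only a minimal sketch: the dictionary keys and the helper name `mean_of` are illustrative labels introduced here, and the script reproduces the descriptive arithmetic alone, not the significance tests of the original study.

```python
# Mean reaction times (ms) as reported in the text for
# Imamura, Sato and Koizumi (2016); labels are illustrative.
mean_rt_ms = {
    "canonical": 1410,            # subject-NOM, object-ACC, verb
    "subject_topicalized": 1414,  # subject-TOP, object-ACC, verb
    "scrambled": 1512,            # object-ACC, subject-NOM, verb
    "object_topicalized": 1626,   # object-TOP, subject-NOM, verb
}


def mean_of(*conditions):
    """Average the reported means of the named conditions."""
    return sum(mean_rt_ms[c] for c in conditions) / len(conditions)


# Overall scrambling effect: object-first orders minus subject-first orders.
subject_first = mean_of("canonical", "subject_topicalized")
object_first = mean_of("scrambled", "object_topicalized")
scrambling_effect = object_first - subject_first

# Topicalization differences within each word order.
subject_topic_diff = mean_rt_ms["subject_topicalized"] - mean_rt_ms["canonical"]
object_topic_diff = mean_rt_ms["object_topicalized"] - mean_rt_ms["scrambled"]

print(subject_first, object_first, scrambling_effect)
print(subject_topic_diff, object_topic_diff)
```

The asymmetry is visible directly in the last two values: topicalizing the subject adds almost nothing, whereas topicalizing the object adds a difference of the same order as the scrambling effect itself.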

To recap the processing results of Imamura, Sato and Koizumi (2016), the
subject topicalized word order S_TOP O_ACC V is the same as that of the canonical
S_NOM O_ACC V. Thus, as Imamura, Sato and Koizumi (2016) suggest, this order seems to
be so commonly used that the structures of S_NOM O_ACC V and S_TOP O_ACC V are easily
understood within a short processing time. A simple explanation could be that
native Japanese speakers interpret the NP marked by the topic marker -wa in the SOV
order simply as the subject. The stimulus sentences of Imamura, Sato and Koizumi
(2016), which used two family names, included an example of subject topicalization
(S_TOP) in the sentence Sato-wa Suzuki-o hometa. On reading the name Sato, it is
understood that no other actor praised Suzuki. This exclusionary meaning may not
function well for S_TOP, which may also explain the lack of a processing time
difference between S_NOM O_ACC V and S_TOP O_ACC V. The lower interpretability of
the exclusionary meaning makes it easier to understand S_NOM and S_TOP as simply
the subjects of a sentence.

Conversely, with the object topicalization (O_TOP) of Sato-wa Suzuki-ga hometa,
Sato is the recipient of praise from the actor Suzuki. In this sentence, the recipient
object Sato is placed at the beginning of the sentence by topicalization. However, this
placement also stems from scrambling movement. With the object Sato topicalized, the
exclusionary meaning is probably easier to understand: among the group who could be
praised, only Sato was praised by Suzuki. This semantic processing may require longer
processing time than does the scrambled O_ACC S_NOM V. Consequently, the ease of
exclusionary interpretation of the topic marker -wa could be an additional factor in
the extra processing load for the object topicalized O_TOP S_NOM V.

The Japanese topic marker -wa appears to have the dual functions of sentence-initial
topicalization and exclusionary focus. Building on Imamura, Sato and Koizumi
(2016), future studies should clarify the exclusionary function of the Japanese topic
marker -wa. A clear exclusionary meaning could be established by using sentences
with animacy contrast. For example, sentence (3) haha-wa ringo-o tabe-ta has a clear
animacy contrast, and subject topicalization conveys that “my mother,” not any other
family member, ate an apple. In object topicalization, the focus is on “an apple,”
not any other fruit, which my mother ate. The function of the topic marker -wa could
be clarified by setting a clear exclusionary meaning in a sentence.

Further, to avoid interference from canonical and scrambled word orders in
identifying topicalization in the Japanese language, it would be advisable to use
VSO-ordered languages, such as Tagalog, Hawaiian, and Tongan, for comparison. For
example, the canonical order of the Tongan language is VSO (Churchward 1953;
Custis 2004; Dixon 1979, 1994; Otsuka 2000, 2005a, 2005b). As in Japanese, the
subject and object may be topicalized. When the subject is topicalized, S_TOP is
placed before the verb, yielding SVO. Native Tongan speakers also perceive that the
topic of the sentence, "mother," signifies that it was not any other family member
who ate an apple. When the object is topicalized, O_TOP is placed before the verb,
yielding OVS, signifying that it was "an apple," not any other fruit, that the mother
ate. Thus, the topicalized NP is placed before the verb as SVO or OVS, and the word
order of topicalization in the Tongan language does not overlap with either canonical
or scrambled orders in Japanese. Hence, a verb-initial language such as Tongan is
ideal for probing the processing function of topicalization.


7 Closing remarks 


The question of how SOV and OSV transitive sentences in Japanese are processed
can be summarized as follows: The SOV order is the basic structure for sentence
processing in Japanese. An OSV scrambled sentence is processed using gap-filling
parsing (establishing the filler-and-gap dependency). Even though pre-head
anticipatory processing functions before the final verb appears, head-driven
processing (verb argument information) is also required, especially for processing an
OSV scrambled sentence. Topicalization of a Japanese subject marked by -wa
has the dual function of stating the topic at the beginning of the sentence and adding
an exclusionary meaning. Nevertheless, as subject topicalization uses the same
word order as the canonical SOV, a topicalized noun phrase may be interpreted as
being the subject, as is an NP marked by the nominative case marker -ga. This applies
all the more when there is no clear semantic distinction between the subject and
the object. By contrast, object topicalization is understood as both sentence-context
topicalization and exclusionary-semantic focus. The processing of the dual functions
of the topic marker -wa may explain the longer processing time required compared with
the OSV scrambled order. Future studies can probe Japanese sentence processing by
comparing Japanese to languages with different canonical orders, especially
verb-initial languages such as Tongan, in which the subject and object topicalized
orders do not overlap with the canonical or scrambled order.

Daichi Yasunaga

Chapter 6
Sentence processing cost caused by word order and context: Some
considerations regarding the functional significance of P600


1 Introduction 


In the field of psycholinguistics, sentence comprehension is a topic of great interest 
(Miyamoto 2008; Koizumi 2015). Experiments using event-related brain potentials 
(ERPs) are often utilized as a research method to explore this topic (Sakamoto 2015). 
Among the ERP components, P600 is often used as an indicator of sentence-pro- 
cessing load (Coulson, King, and Kutas 1998; Friederici, Pfeifer, and Hahne 1993; 
Osterhout and Holcomb 1992; Osterhout and Mobley 1995; Hagoort, Brown, and 
Groothusen 1993; Hagoort, Brown, and Osterhout 1999; Kaan and Swaab 2003a, 
2003b). However, recent studies offer various interpretations of the functional
significance of P600, and the type of cognitive process that P600 reflects
as an ERP component remains a matter of debate. This study conducts a preliminary
examination of how the pattern of P600 for the processing load of scrambled
sentences changes depending on context (i.e., line drawings presented to the
participants). Based on the results of this experiment, we aim to provide material
for studying the functional significance of P600.

As many cognitive neuroscientists know, ERPs are negative and positive voltage
changes in the ongoing electroencephalogram (EEG) that are time-locked to the
onset of a cognitive event. P600 is an ERP effect that shows a positive peak around
600 milliseconds (ms) after the onset of a stimulus. Since the effect was first
reported in the early 1990s, P600 has been considered an ERP component that
reflects the process of detecting or correcting syntactic anomalies (Osterhout and
Holcomb 1992; Hagoort et al. 1993). However, several subsequent studies report that
a P600 effect is produced not only by ungrammatical sentences but also by grammatical
sentences that are structurally complex or (temporarily) ambiguous. Moreover, there
have been reports of a "semantic P600" effect that occurs in sentences containing
inappropriate semantic roles without grammatical anomalies (Bornkessel-Schlesewsky
and Schlesewsky 2008; Hoeks, Stowe, and Doedens 2004; Kim and Osterhout 2005; Kim and
Sikos 2011; Kuperberg et al. 2003, 2006, 2007). As reported by Vissers et al. (2008),
for example, P600 is observed for sentences that do not adequately describe the
preceding picture. A review of research reports over the last decade shows that it is
difficult to conclude that P600 reflects only (morpho-)syntactic processing load.

This study reports the results of an experiment examining how the occurrence 
of the P600 effect is affected by processing loads created by syntactically complex 
sentences and the context formed by line drawings. Through this experiment, we 
describe changes observed in the appearance of P600 depending on the experi- 
mental conditions and provide facts to discuss the P600's functional significance. 


2 Word order and sentence-processing cost: 
Previous studies 


Previous research in psycholinguistics and cognitive neuroscience shows that derived
word orders tend to incur a higher processing cost than the syntactically basic
word order of each language (for Japanese: Mazuka, Itoh, and Kondo 2002; Miyamoto
and Takahashi 2002; Ueno and Kluender 2003; Koizumi and Tamaoka 2004; Tamaoka
et al. 2005). Scholars propose that the processing cost of derived word orders is
higher because the syntactic structure of a derived word order is more complex
than that of the basic word order (Marantz 2005), and because additional information
processing, such as filler-gap dependency processing, is required (Gibson 2000;
Hawkins 2004). Regardless of which model is assumed, we can predict that the
processing cost of derived word-order sentences will be larger than that of basic
word-order sentences. This suggests that measuring the size of the processing load
can provide clues as to the basic word order of a language.

Before reporting the experiment of this study, we introduce two related
studies that explored the relationship between free word-order phenomena and
sentence-processing load in Kaqchikel. Kaqchikel is a Mayan language spoken in
Guatemala (Tay Coyoy 1996; Brown, Maxwell, and Little 2006; Lewis 2009). This
language allows for free alternation of the subject (S), object (O), and verb (V)
word orders. In the case of transitive sentences, all six logically possible word
orders are grammatical, including (1b) VSO and (1c) SVO, in addition to the VOS in
(1a). Among these word orders, VOS has been historically and theoretically considered
the canonical word order (Rodríguez Guaján 1994; Tichoc Cumes et al. 2000; Ajsivinac
Sian et al. 2004; England 1991; Aissen 1992).


(1) a. VOS order  X-ø-u-chóy ri chaj ri ajanel.
                  PAST-3sgABS-3sgERG-cut DET pine tree DET carpenter
    b. VSO order  X-ø-u-chóy ri ajanel ri chaj.
    c. SVO order  Ri ajanel X-ø-u-chóy ri chaj.

    "The carpenter cut the pine tree."


In a language that allows for such free alternation of word order, the answer to
the question "What is the basic syntactic order of the language?" is often debated.
The two studies presented here are cognitive neuroscience studies that examine
the basic word order in Kaqchikel. Yasunaga et al. (2015) and Yano, Yasunaga, and
Koizumi (2017) addressed a common question: "If VOS is the canonical word order,
do the other word orders, as derived word orders, increase the processing load?"
Despite this commonality, the two studies employ slightly different experimental
methods.

Yasunaga et al. (2015) used a picture-sentence (PS) matching task in which
participants were asked to judge whether the content of the preceding picture matched
that of the following sentence. Their findings are as follows: When comparing the
second region of the VOS and VSO word orders, P600 was observed for VSO. They
concluded that even in modern Kaqchikel, the basic word order in the mind of the
native speaker is VOS, and the other word orders are processed as derived word
orders. However, Yasunaga et al. (2015) identified several problems in this
study. As the relative order of presentation was "picture, then sentence," the
sentence structure may already have been created in the participants' minds once
they saw the picture, and the EEG may have picked up inconsistencies in the matching
process between the "sentence structure created first" and the "sentence actually
heard." That is, the P600 reported by Yasunaga et al. (2015) may not have reflected
the load of processing a sentence with a complex structure; instead, it could have
reflected the load caused by processing a sentence different from the one constructed
from the context (line drawing).

Yano, Yasunaga, and Koizumi (2017) adopted a sentence-picture (SP) matching
task, in which the order of stimulus presentation was the reverse of that used by
Yasunaga et al. (2015), to overcome the problems they identified. First, the
sentence was presented, and then the participants were asked to judge whether it
matched the content of the picture that followed. EEG measurements were taken
while participants listened to the sentence; thus, the brain response was recorded
only before they looked at the picture. The findings are as follows: When comparing
the third region of the VOS and VSO word orders, P600 was observed for
the VSO word order. The authors noted that the filler-gap dependency could be
processed in the second region in Yasunaga et al.’s (2015) experiment because the
context was given. However, in Yano, Yasunaga, and Koizumi’s (2017) experiment,
the context was not given; thus, the processing of the filler-gap dependency
did not occur until the third region. This difference in when the filler-gap
dependency was processed yielded a difference in the timing at which P600 was
observed.

Yasunaga et al. (2015) and Yano, Yasunaga, and Koizumi (2017) reported that
P600 was observed for the derived word order in Kaqchikel. Two additional questions
thus arise. The first is whether scrambled Japanese sentences also
produce the P600 effect in the PS and SP tasks. Several prior studies report that
P600 is observed for scrambled Japanese sentences, but whether it is observed in SP
or PS tasks has yet to be reported. The second is whether performing the two tasks
with the same participants offers any new insight into the functional significance
of P600. Thus, we performed SP and PS tasks with the same participants using
Japanese scrambled sentences.


3 Experiments 


The experiment reported in this section was conducted during the COVID-19 pan- 
demic, which induced various limitations, such as the small number of partici- 
pants. 

In the experiment, data for the SP and PS tasks were obtained from each
participant. The order of the SP and PS tasks was counterbalanced across participants.
Participants comprised 16 native Japanese speakers from Kanazawa University (10
men and 6 women; M = 21.2 years old, range 18-22 years). All participants were
right-handed and had normal vision and hearing. Informed consent was obtained
before participation, both verbally and in writing. The study was approved by the
ethics committee of Kanazawa University. The experiment was conducted in accordance
with the guidelines for face-to-face research under COVID-19 conditions set by
Kanazawa University.

3.1 Stimulus


The pictures used as stimuli in the experiment were the same as those used by 
Yano, Yasunaga, and Koizumi (2017), as shown in Figure 1. Six transitive actions 
that could be expressed using simple line drawings were adopted. Agents and 
patients were distinguished by four colors: blue, red, black, and white. The number 
of pictures in which the agents appeared on the left and right sides was balanced. 


yonda, "called"    ketta, "kicked"    shikatta, "scolded"

nagutta, "hit"     oshita, "pushed"   nageta, "threw"


Figure 1: Examples of the pictures used as stimuli. 


Two types of sentences were also prepared, as shown in (2). The first was SOV 
word-order sentences (referred to as SO order), in which the subject precedes the 
object (usually considered the canonical word order in Japanese). The other type 
was OSV order sentences (referred to as OS order), in which the object precedes the 
subject (usually called scrambled sentences). 


(2) a. SO order  aka-ga   ao-o      yon-da.
                 red-NOM  blue-ACC  called
    b. OS order  ao-o     aka-ga    yon-da.

    "The red (person) called the blue (person)."


We recorded the speech of a female speaker of the Tokyo dialect. The length of
the audio stimuli was adjusted to minimize the variability in ERP response latency,
as shown in Tables 1 and 2.

Table 1: Speech duration of each speech stimulus in the first and second
regions (unit: ms).

                  NP-ga in SO   NP-o in SO   NP-ga in OS   NP-o in OS
aka, “red”            407          404           403          406
ao, “blue”            400          400           403          409
kuro, “black”         404          402           404          403
shiro, “white”        410          408           411          405
AVG.                  405          404           405          406
SD.                   3.70         2.96          3.34         2.17

Note: NP: noun phrase


Table 2: Speech duration of each speech stimulus in the third region (unit: ms).

                        SO order   OS order
nagutta, “hit”             471        472
ketta, “kicked”            398        399
oshita, “pushed”           416        414
yonda, “called”            412        412
shikatta, “scolded”        471        471
nageta, “threw”            396        396
AVG.                       427        427
SD.                        31.7       31.9
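
The AVG. and SD. rows of Tables 1 and 2 can be recomputed from the individual durations. The sketch below uses only the values printed in the tables; the variable and function names are illustrative. The use of the population standard deviation (dividing by n rather than n - 1) is an assumption on our part, adopted because it is the variant that reproduces the printed SD values.

```python
from statistics import fmean, pstdev

# Durations (ms) copied from Table 1 (first and second regions).
table1_ms = {
    "NP-ga in SO": [407, 400, 404, 410],
    "NP-o in SO": [404, 400, 402, 408],
    "NP-ga in OS": [403, 403, 404, 411],
    "NP-o in OS": [406, 409, 403, 405],
}

# Durations (ms) copied from Table 2 (third region, six verbs).
table2_ms = {
    "SO order": [471, 398, 416, 412, 471, 396],
    "OS order": [472, 399, 414, 412, 471, 396],
}


def summarize(durations_ms):
    """Return (mean, population SD) of a list of durations in ms."""
    return fmean(durations_ms), pstdev(durations_ms)


for label, values in {**table1_ms, **table2_ms}.items():
    avg, sd = summarize(values)
    print(f"{label}: AVG = {avg:.2f} ms, SD = {sd:.2f} ms")
```

Across conditions, the averages differ by no more than about 2 ms, which is the sense in which the stimulus lengths were matched.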


The stimulus onset asynchrony was set to 1,200 ms to avoid overlapping responses 
with the previous region. All speech materials were checked for unnatural pronun- 
ciation and inaudibility by three native Japanese speakers who did not participate 
in the ERP experiment. 


3.2 Procedures 


The SP task was conducted according to the following procedure. First, a numbered
countdown was displayed, followed by a white fixation cross. When the cross
turned red (depicted in black in Figure 2), the audio clip of the experimental
sentence was played. After the sentence was presented, a picture was shown, and
the participants pressed a button to indicate whether the sentence and picture
matched. The EEG was recorded while the red cross was displayed.
The entire task comprised 48 YES trials in the SO order, 48 YES trials in the OS
order, and 96 NO trials, all presented to the participants in random order. The task
was divided into six blocks with rest periods. Psychtoolbox 3.0.17 (Brainard 1997;
Pelli 1997; Kleiner, Brainard, and Pelli 2007), running on MATLAB R2020a (The
MathWorks, Inc.), was used to present the stimuli and acquire the participants’
behavioral responses.


[Figure 2 panels: Fixation (200 ms); Sentence Presentation (Auditory);
Picture Presentation (2,000 ms) and Matching Task (Button Pressing)]


Figure 2: Procedure of the sentence-picture matching task. 


In the PS task, by contrast, the picture was presented before the sentence. A
countdown was followed by a picture that was displayed for 2,000 ms. A fixation cross
was then displayed, and the sentence audio clip was presented. Subsequently, a
question mark was displayed, and the participant indicated whether the picture
and the sentence conveyed the same meaning by pressing a button. This task
comprised 48 YES trials in the SO order, 48 YES trials in the OS order, and 96 NO
trials. In a NO trial, for example, a picture of "a black person kicking a white
person" was presented, as shown in Figure 3, but the audio clip "aka-ga ao-o nagutta.
(The red hit the blue)" was played. The NO trials also included so-called "role
reversal sentences" (sentences in which thematic roles are reversed). In total, there
were 48 trials in which the color of the agent or patient differed between the
sentence and the picture, 24 trials in which the thematic roles were reversed, and
24 trials in which the verb content differed between the sentence and the picture.
This task was divided into six blocks with rest periods.
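
The trial composition described above can be sketched as a randomized trial list. This is only an illustration of the counts reported in the text, not the presentation script used in the experiment (which ran on Psychtoolbox). The function name `build_blocks`, the trial labels, the fixed random seed, and the assumption that the six blocks are of equal size are all hypothetical.

```python
import random

# NO-trial subtypes and their counts as reported in the text.
NO_TRIAL_TYPES = {
    "color mismatch": 48,
    "role reversal": 24,
    "verb mismatch": 24,
}


def build_blocks(seed=0, n_blocks=6):
    """Return a shuffled trial list split into equally sized blocks.

    Each trial is a (expected_response, condition) pair. Equal block
    sizes are an assumption; the text only states that there were six
    blocks separated by rest periods.
    """
    trials = [("YES", "SO order")] * 48 + [("YES", "OS order")] * 48
    for mismatch_type, count in NO_TRIAL_TYPES.items():
        trials += [("NO", mismatch_type)] * count

    if len(trials) % n_blocks != 0:
        raise ValueError("trials cannot be split into equal blocks")

    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    rng.shuffle(trials)

    block_size = len(trials) // n_blocks
    return [trials[i * block_size:(i + 1) * block_size] for i in range(n_blocks)]


blocks = build_blocks()
print(len(blocks), [len(block) for block in blocks])
```

Under these assumptions, the 192 trials fall into six blocks of 32, with YES and NO responses equally frequent overall.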

The EEG was recorded from 19 locations on the participant's scalp using the
amplifier Polymate AP1532. Nineteen tin electrodes were used with an elastic cap.
Linked earlobes served as the reference. The sampling rate was set to 1,000 Hz,
and the bandpass filter was set to 0.01-100 Hz.

Figure 3: Procedure of the PS task.


The EEG data were analyzed using EEGLAB v2020.0 (Delorme and Makeig 2004),
running on MATLAB R2020a. Before averaging, the EEG data were downsampled
to 200 Hz and re-filtered to 0.05-50 Hz. The ERP was quantified by averaging in the
time window from -200 to 1,000 ms (-200 to 0 ms served as the baseline). Trials with
artifacts exceeding ±80 μV in this time window were automatically eliminated. For
exploratory analysis, the amplitudes were divided into successive 100 ms time
windows. Error trials, in which participants responded NO in the match condition or
YES in the mismatch condition, were also excluded from averaging.
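
The per-epoch steps described above (baseline correction, amplitude-based rejection, and division into 100 ms windows) can be sketched for a single channel as follows. This is not the EEGLAB pipeline used in the study: the function `process_epoch` and the constants are hypothetical names, and applying the ±80 μV criterion after baseline correction and over the whole epoch is an assumption, since the text does not specify these details.

```python
from statistics import fmean

SRATE_HZ = 200          # sampling rate after downsampling
EPOCH_START_MS = -200   # epoch onset relative to stimulus onset
EPOCH_END_MS = 1000     # epoch offset (exclusive)
REJECT_UV = 80.0        # rejection threshold in microvolts
BIN_MS = 100            # width of each analysis window


def ms_to_index(t_ms):
    """Convert a latency in ms to a sample index within the epoch."""
    return (t_ms - EPOCH_START_MS) * SRATE_HZ // 1000


def process_epoch(epoch_uv):
    """Baseline-correct one single-channel epoch and bin its amplitude.

    Returns None if the epoch is rejected, otherwise a dict that maps
    (start_ms, end_ms) to the mean amplitude in that window.
    """
    if len(epoch_uv) != ms_to_index(EPOCH_END_MS):
        raise ValueError("epoch must span -200 to 1,000 ms at 200 Hz")

    # Subtract the mean of the pre-stimulus interval (-200 to 0 ms).
    baseline = fmean(epoch_uv[:ms_to_index(0)])
    corrected = [v - baseline for v in epoch_uv]

    # Assumed criterion: reject if any corrected sample exceeds +/-80 uV.
    if max(abs(v) for v in corrected) > REJECT_UV:
        return None

    bins = {}
    for start in range(0, EPOCH_END_MS, BIN_MS):
        lo, hi = ms_to_index(start), ms_to_index(start + BIN_MS)
        bins[(start, start + BIN_MS)] = fmean(corrected[lo:hi])
    return bins
```

With these settings an epoch contains 240 samples, of which the first 40 form the baseline, and each 100 ms window contains 20 samples. Averaging the binned values over accepted trials of one condition would then give the window-wise ERP amplitudes compared between the SO and OS orders.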


4 Results and discussion 
4.1 Behavioral data 


In both tasks, no difference in the percentage of correct responses to the matching 
task was observed regarding the word order (SP task: SO 96.3% = OS 95.2%; PS task: 
SO 98.2% = OS 98.2%). 

In the SP task, the reaction time was significantly shorter in the SO order than
in the OS order (t(15) = 3.36, p < 0.05). In the PS task, the reaction time was
numerically longer in the OS order than in the SO order, although the difference
was not statistically significant (t(15) = 1.17, n.s.). Perhaps the difference in
reaction time between the SO and OS orders did not reach significance in the PS task
because the pictures provided sufficient contextual information. Some studies show
that the processing cost of constructions considered inherently costly can be reduced
by appropriate context. For example, Koizumi and Imamura (2017) show that the
OS order in which the object is old information demanded a smaller processing
load than the OS order in which the object is new information. Yano and
Koizumi (2021) show that, given the appropriate context, the processing difficulty of
OS order becomes smaller than that of SO order. In the PS task, the participants
pressed the button after hearing the sentence, whereas in the SP task, they pressed
it after seeing the picture. Thus, we could not directly compare the reaction times
across tasks because the participants made decisions at different times.
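
The t values above have 15 degrees of freedom, which is what a paired comparison over 16 participants would yield. The following sketch shows how such a paired t statistic is computed from per-participant condition means. The function name `paired_t` is illustrative, and the reaction times in the example are hypothetical values invented for demonstration; they are not the data of this study.

```python
from math import sqrt
from statistics import fmean, stdev


def paired_t(cond_a, cond_b):
    """Return (t, df) for a paired comparison of two conditions.

    cond_a and cond_b hold one value per participant, in the same order.
    """
    if len(cond_a) != len(cond_b) or len(cond_a) < 2:
        raise ValueError("need two equally long lists with at least 2 values")
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    # Mean difference divided by its standard error (sample SD / sqrt(n)).
    t_value = fmean(diffs) / (stdev(diffs) / sqrt(n))
    return t_value, n - 1


# Hypothetical per-participant mean reaction times (ms); NOT the study's data.
os_rt_ms = [960, 1010, 880, 940]
so_rt_ms = [900, 930, 870, 910]
t_value, df = paired_t(os_rt_ms, so_rt_ms)
print(f"t({df}) = {t_value:.2f}")
```

With 16 participants, the same function would return df = 15, matching the degrees of freedom reported in the text.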


Table 3: Correct response rate and reaction time for the two tasks.

                        Correctness (%)   Reaction time (ms)
SO order in PS task       98.2 (1.82)         341 (193)
OS order in PS task       98.2 (2.88)         351 (212)
SO order in SP task       96.3 (4.30)         903 (355)
OS order in SP task       95.2 (6.27)         960 (409)

Note: Numbers in parentheses are standard deviations.


Summarizing the data for the behavioral indicators, the overall trend suggested 
that the OS order was more challenging than the SO order. This trend is consistent 
with several studies on Japanese scrambled sentence processing. 


4.2 Event-related brain potentials data 


First, we report ERP data for the SP task. In this task, the effect of word order on
amplitude was observed only in the second region. In the 500-600 ms time window,
the ERP for the OS order significantly and positively shifted relative to the ERP for
the SO order (p < 0.05).

In the PS task, by contrast, the effect of word order on amplitude was observed in
all regions. In all regions, the OS order was shifted in a positive direction relative
to the SO order (R1: p < 0.05; R2: p < 0.01; R3: p < 0.001). Figures 5a, 5b, and 5c
show all waveforms and topographies.

In summary, the effect of word order was observed only in the second region 
of the SP task. In the PS task, this was observed in all the regions. In the following 
sections, we discuss the results for each region across the two tasks. 

Figure 4: Event-related brain potential waveform and potential map in the second region of the SP
task. The potential map shows the difference of OS order minus SO order in the 500-600 ms time
window.


4.3 Discussion 


P600 observed in the first region of the PS task: In the first region, P600 was
observed only in the PS task and not in the SP task. In the PS task, a participant
can anticipate what type of sentence will be input based on the picture
presented before the sentence, as noted by Yasunaga et al. (2015). This result can
be interpreted as a "surprise" caused by the input of an element different from
the participant's expectation, reflected as P600. Previous studies reported negative
components for the first region in OS order (i.e., the object at the beginning of the
sentence), reflecting working memory load and deviation from predictions regarding
dependency construction (Hagiwara et al. 2007; Ueno and Kluender 2003).
However, in this study, this negative effect was rarely observed in this region,
and P600 was observed instead. The author interprets this P600 as a component
that reflects a more general cognitive load, independent of deviations from
syntactic predictions. In the SP task, P600 was not observed in the first region.
Thus, the presence or absence of P600 may stem from the difference between the two
tasks. Vissers et al. (2008) also report that P600 was observed in a PS task when
a sentence was presented that did not match the content of the preceding picture.
That P600 had a latency of 500-700 ms and was observed over the whole scalp, mainly
at Cz. The P600 observed in the first region of this study appeared in the same time
window and with a similar scalp distribution. Therefore, the P600 observed in the
first region of the PS task likely reflects the same cognitive process as that noted
by Vissers et al. (2008).

Figure 5: Event-related brain potential waveforms and potential maps of the picture-sentence
matching task: (a) the first region, (b) the second region, and (c) the third region. The potential map
shows the difference of OS order minus SO order. The time window of each map is shown under the
map.




In the SP task, by contrast, no context was given, so no "surprise" occurred and
P600 was not observed, unlike in the PS task. These interpretations suggest that P600
is unlikely to be an ERP component specific to syntactic processing. If P600 were an
ERP component specific to syntactic processing, the same degree of effect should
be observed for the derived word order in the first region, regardless of the task.


P600 observed in the second region of the SP and PS tasks: In the second region, 
P600 was observed in the SP and PS tasks. In the SP task, sentence processing takes 
place without the context (picture) being given; thus, the brain's response is likely 
to reflect sentence processing more purely. The P600 in the second region of scram- 
bled Japanese sentences accords with the results of previous studies, such as Hagi- 
wara et al. (2007), Ueno and Kluender (2003), and Yano and Koizumi (2018). Thus, 
this component, as observed in this study, may stem from the processing load of 
the filler-gap dependency associated with the processing of scrambled sentences. 

However, if P600 reflects only the syntactic processing load, the PS task should
yield results similar to the SP task. Although the amplitude difference in the
second-region P600 between the PS and SP tasks did not reach statistical significance
(n = 16, p = 0.07), different scalp distributions of the amplitude effects are
apparent on visual inspection. The second-region P600 in the SP task is widely
distributed, suggesting that this P600 has neural generators in deeper brain regions.
However, the corresponding effect in the PS task appears focally in the frontal
regions. Thus, the second-region P600 in the PS task may be generated from
superficial, localized cortical regions. If we accept the distributional difference
in the second-region P600s between the two tasks, then beyond the processing of the
filler-gap dependency in the second region, a more general cognitive load was
superimposed in the PS task.⁷

Likewise, if we accept the difference in P600 amplitude between the tasks in the second
region, then, beyond the processing of the filler-gap dependency, a more general cognitive
load, as observed in the first region, appears to have been superimposed in the PS task.


P600 observed in the third region of the PS task: In the third region, the P600 was
observed only in the PS task, not in the SP task. In this region, the participants'
predictions for verbs were not betrayed, as verbs always appear in this region.
Moreover, we analyzed only cases where the picture and sentence matched. Nor is the
filler-gap dependency expected to be processed in this region. Therefore, we must
consider that the P600 in the third region reflects a cognitive process different from
that in the first and second regions.


7 The discussion in this paragraph was built on comments from anonymous reviewer 22. Further 
clarification of the three relationships between the source of P600, its distribution on the scalp, 
and the functional significance of P600 will clarify the implications of the various P600s observed 
in this study. 
Figure 6: Comparison of the strengths of the effects of the second region in the sentence-picture
matching task (left panel) and in the picture-sentence matching task (right panel). 


In the PS task, the participants were required to make a final decision regarding 
whether the picture and sentence matched after the input of the third region. 
Participants reviewed the entire sentence again for the matching task. This task 
reconfirmed the mismatches in the first and second regions; thus, the processing 
load increased again. One piece of supporting evidence for this possibility is that 
the time window for P600 in the third region was relatively late. This somewhat late 
response was likely because the task required a process of synthesizing the context 
(picture) and the whole sentence. 


5 Summary 


This study addressed two questions. First, would scrambled Japanese sentences 
also produce a P600 effect during the SP and PS tasks? The answer is yes. P600 was 
observed in the region where the filler-gap dependency could be processed (the 
second region). Second, would we find any new insights regarding P600 by per- 
forming the two tasks on the same participants? In the PS task, where contextual 
information was available, P600 was also observed in association with the mismatch between
the context and expectation and with the participant’s final decision on the task. These facts
suggest that P600 is not just a reflection of syntactic challenges, and superimposed 
P600 may also be observed with a broader range of cognitive loads (specifically, 
failure of expectations). 
Masataka Yano


Chapter 7 
The adaptive nature of language 
comprehension 


1 Introduction 


Human language-processing studies have generally focused on the fixed aspects of 
processing mechanisms, with an (often implicit) assumption that they do not vary 
over time (Bornkessel and Schlesewsky, 2006; Frazier, 1987; Friederici, 2002; Hale, 
2001; Lewis, Vasishth, and Van Dyke, 2006; MacDonald, Pearlmutter, and Seiden- 
berg, 1994). However, speakers vary in terms of their lexical and syntactic knowl- 
edge and preferences (Bloom and Fischler, 1980; Han, Musolino, and Lidz, 2016; 
Sprouse, Wagers, and Phillips, 2014). Given that language processing is highly 
predictive, such variation induces the language-processing system to encoun- 
ter prediction errors (Altmann and Kamide, 1999; Federmeier, 2007; Kamide, 
Altmann, and Haywood, 2003; Luke and Christianson, 2016). This raises the
question of how the language-processing system deals with repeated exposure
to unexpected input in sentence comprehension.

Recent behavioral studies show that comprehenders can rapidly adapt to lin- 
guistic characteristics of input (Creel, Aslin, and Tanenhaus, 2008; Farmer et al. 
2014; Fine and Jaeger, 2013, 2016; Fine et al. 2013; Kaan and Chun, 2018b; Kamide, 
2012; Kurumada, Brown, and Bibyk, 2014).¹ For instance, while people experience
difficulty in the processing of garden-path sentences that require a revision of
syntactic structures (e.g., “The experienced soldiers warned about the danger
conducted the midnight raid”), Fine et al. (2013) found that the difficulty of
processing garden-path sentences decreased as participants were repeatedly exposed to
them in an experiment. Sharer and Thothathiri (2020) revealed that this syntactic 
adaptation effect was correlated with activation in the pars opercularis of the 
left frontal cortex. Importantly, adaptation differs from syntactic priming in that 
it is cumulative (i.e. a gradual convergence toward the statistics of the input) 
and long-lasting (Kroczek and Gunter 2017). However, the underlying mechanism 


1 For this chapter, we used the term “adaptation” to refer to when language processing changes in 
a relatively short period, resulting in a more suitable state for an environment. 
has not been fully understood. Further, as the previous experiments are limited
regarding target languages (i.e., English and German) and constructions (i.e.,
structural ambiguity resolution and alternation), the limits of adaptive behavior
are unknown.


1.1 Underlying mechanisms 


Previous accounts can be broadly categorized into two positions. The expectation- 
updating account proposes that the language-processing system has a probabilistic 
belief about how often certain types of sentences appear. When encountering a less 
frequent and, thus, unexpected structure, the belief is updated accordingly (Fine 
et al. 2013; Fine and Jaeger 2013). Consequently, the same structure is subsequently
processed more easily. The alternative interpretation, in contrast, assumes that
various syntactic frames are stored with different base-level activation in
declarative memory. When a less frequent syntactic frame is repeatedly
used, its base-level activation increases; hence, its activation subsequently requires 
a lower cost to exceed the activation threshold (Reitter, Keller, and Moore 2011; cf. 
Kaan and Chun 2018b). Therefore, the processing of a priori infrequent syntactic 
frames is facilitated. We refer to this account as a representation-based account. 
As prior behavioral studies of adaptation have mainly focused on the processing of 
structural ambiguity resolution, it is difficult to tease these two accounts apart. 
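
The contrast between the two accounts can be made concrete with a toy sketch. The code is illustrative only and is not the model used in any of the cited studies: the expectation-updating account is approximated here by a Beta-Bernoulli belief about how often a structure occurs, and the representation-based account by an ACT-R-style base-level activation that grows with repeated use and decays over time. The function names, the prior counts, and the decay value of 0.5 are assumptions chosen for illustration.

```python
import math

def update_expectation(prior_rare, prior_common, observations):
    """Beta-Bernoulli belief update (an illustrative assumption).

    observations: iterable of booleans, True when the rare structure occurred.
    Returns the expected probability of the rare structure after each trial.
    """
    rare, common = prior_rare, prior_common
    trajectory = []
    for is_rare in observations:
        if is_rare:
            rare += 1
        else:
            common += 1
        trajectory.append(rare / (rare + common))
    return trajectory

def surprisal(probability):
    """Surprisal in bits; higher values stand in for greater processing cost."""
    return -math.log2(probability)

def base_level_activation(use_times, now, decay=0.5):
    """ACT-R-style base-level activation: ln(sum of (now - t) ** -decay).

    use_times: times at which the syntactic frame was used (all earlier than now).
    """
    return math.log(sum((now - t) ** -decay for t in use_times))

# Expectation updating: repeated exposure raises the expected probability
# of the rare structure, so its surprisal falls.
belief = update_expectation(1, 9, [True] * 10)
print(round(surprisal(0.1), 2), round(surprisal(belief[-1]), 2))

# Representation-based: more uses of a frame raise its activation.
print(base_level_activation([1, 2, 3], now=4) > base_level_activation([3], now=4))
```

In both sketches, repeated exposure makes the rare structure cheaper to process, which is why behavioral facilitation alone does not separate the two accounts.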

For this chapter (Experiments 1 and 2), we tested the two accounts in two ways. 
Experiment 1 involved morphosyntactically ungrammatical sentences. If people 
can adapt to ungrammatical sentences, this would argue against the representation-based
account because, by definition, such sentences do not have any licit syntactic
frame; thus, the activation level cannot increase. To date, it remains unclear
whether ungrammatical sentences trigger adaptation (Coulson, King, and Kutas, 
1998; Gunter and Friederici, 1999; Hahne and Friederici, 1999; Osterhout et al. 1996; 
Yoshida and Miyamoto, 2017). 

For Experiment 2, we tested whether the strength of prediction errors influences
adaptation using aspectual mismatches. The expectation-updating account
predicts that a stronger prediction error will trigger updating an expectation about 
linguistic environments, because it signals that the belief substantially deviates 
from what should be expected. The representation-based account, on the other 
hand, does not predict such an outcome because the aspectual representation is 
the same, regardless of predictive strength. 
1.2 Limits of linguistic adaptation


Another issue addressed in this study involves the limits of adaptation. It is plau- 
sible to suppose that the language-processing system is well designed to adjust to 
a linguistic environment to process input efficiently. However, adaptation is pre- 
sumably not limitless because continuously tracking changes in linguistic environ- 
ments could be demanding of resources. 

Several prior findings suggest that different types of prediction errors have 
different effects on adaptation. For example, Hanulíková et al. (2012) observed that 
native speakers of Dutch exhibit no P600 effect for sentences with gender disagree- 
ment produced by a non-native speaker. In contrast, semantic violations elicited an 
N400 effect, suggesting that people are less likely to adapt to semantic violations 
(Grey and Hell, 2017; Romero-Rivas, Martin, and Costa, 2015, 2016). Thus, for Exper- 
iment 3, we tested whether repeated exposure to semantic violations mitigated pro- 
cessing difficulties in Japanese individuals. 

For this chapter, we used event-related potentials (ERPs) because they allowed 
for selectively tracking how processes of interest changed during the experiments. 
Further, by using ERPs as an index of sentence processing costs, we could avoid the 
possibility that apparent adaptation effects would result from response familiariza- 
tion involved in sentence plausibility judgment and self-paced reading tasks. 


2 Experiment 1: Morphosyntactic adaptation 
2.1 Method 


For Experiment 1, we tested the expectation-updating and representation-based 
accounts using morphosyntactic violations, as shown in (1b). The sentence in (1a) is 
grammatical, while that in (1b) entails a morphosyntactic violation, as an intransi- 
tive verb must mark a single argument with a nominative case (“-ga”), but not with 
an accusative case (“-o”), regardless of the agentivity of the argument in Japanese. 
We divided 52 pairs of the target sentences, such as (1), into two lists according to the 
Latin square design, such that each participant read 26 sentences of each condition. 


(1) a. Grammatical sentence:
bara-ga kare-ta.
rose-NOM wither-PST
“The rose withered.”
b. Morphosyntactic violation:
*bara-o kare-ta.
rose-ACC wither-PST
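
The Latin square assignment of the 52 sentence pairs to two lists can be sketched as follows. This is a minimal illustration, assuming a simple rotation of conditions across items; the function name and the item indexing are hypothetical, and the chapter does not describe the actual assignment procedure in more detail.

```python
def latin_square_lists(n_items, conditions):
    """Rotate conditions across items so each list contains every item once."""
    n_cond = len(conditions)
    lists = []
    for list_index in range(n_cond):
        lists.append([
            (item, conditions[(item + list_index) % n_cond])
            for item in range(n_items)
        ])
    return lists

lists = latin_square_lists(52, ["grammatical", "ungrammatical"])
for trial_list in lists:
    counts = {}
    for _, condition in trial_list:
        counts[condition] = counts.get(condition, 0) + 1
    print(counts)
```

Each list contains 26 sentences per condition, and no participant sees both members of a pair in the same list.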


The morphosyntactic violation has been reported to elicit a P600 effect compared to 
the grammatical counterpart in Japanese and in other languages (Nashiwa, Nakao, 
and Miyatani, 2007; Yano, 2018b; Yano and Sakamoto, 2016b). Thus, a decrease in 
the P600 effect over the course of the experiment can be interpreted as an adap- 
tation to morphosyntactic violations. However, the magnitude of the P600 effect 
could have changed during the experiment for several reasons other than adap- 
tation, such as participants' fatigue and lack of attention. To assess these effects, 
we manipulated the probability of grammatical and ungrammatical sentences and 
examined whether ERP differences between ungrammatical and grammatical sen- 
tences decreased only when the participants were exposed to a large proportion 
of ungrammatical sentences. We manipulated the ratio of the grammatical and 
ungrammatical sentences by intermixing filler sentences. In the equal-probabil- 
ity block, the ratio of grammatical and anomalous sentences was 1 to 1, while the 
ratio was 4 to 1 in the low probability block. We informed the participants that 
they would complete two blocks but not how they differed (i.e., the manipulation of 
probability is a within-participant factor). 

We recruited 20 native Japanese speakers from Tohoku University and Kyushu
University. All participants (including those who participated in Experiments 2 and 3) had
normal or corrected-to-normal vision and no history of reading disabilities or neu- 
rological disorders. We obtained written informed consent from all participants 
before the experiments. The experiments were approved by the ethics committees 
of Tohoku University's Graduate School of Arts and Letters and Kyushu University's 
Department of Linguistics. 

Sentences were presented word-by-word. We conducted statistical analyses 
using linear mixed-effects (LME) models, which allowed for handling continuous 
variables, including ITEM ORDER. The models included independent variables of 
interest: PROBABILITY (low/equal), VIOLATION (grammatical/ungrammatical), 
and ITEM ORDER of each block (1-130), with their interactions as fixed factors. 
The dependent variables were 700–900 ms mean amplitudes from the onset of the
verbs (i.e., “withered”), calculated for each trial. The time window was determined 
based on the results of previous ERP studies in Japanese, which showed relatively 
late P600 effects (Mueller, Hirotani, and Friederici, 2007; Nashiwa et al. 2007; Yano, 
2018b; Yano and Sakamoto, 2016b; Yano, Suzuki, and Koizumi, 2018). As the topo- 
graphical distribution of P600 is well known, we used EEG data obtained from the 
centroparietal regions (Cz, Pz, C3/4, P3/4, P7/8, and O1/2). 
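
As an illustration of the trial-level dependent variable, the sketch below averages the samples of one epoch that fall inside the 700–900 ms window and then averages across channels. The function names, the 100 ms sampling step, the synthetic voltages, and the choice to average over channels are assumptions made for the example; the chapter does not specify how the channels entered the models.

```python
def mean_amplitude(epoch, times_ms, window=(700, 900)):
    """Mean of the samples whose latency falls inside the window (inclusive)."""
    start, end = window
    selected = [v for v, t in zip(epoch, times_ms) if start <= t <= end]
    if not selected:
        raise ValueError("no samples inside the window")
    return sum(selected) / len(selected)

def trial_score(channel_epochs, times_ms, window=(700, 900)):
    """Average the window mean across the given channels for one trial."""
    means = [mean_amplitude(e, times_ms, window) for e in channel_epochs.values()]
    return sum(means) / len(means)

# Synthetic single trial, time-locked to verb onset (values are made up).
times_ms = list(range(0, 1001, 100))  # hypothetical 100 ms sampling step
trial = {
    "Cz": [0, 0, 0, 0, 0, 0, 0, 2, 4, 6, 0],
    "Pz": [0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 0],
}
score = trial_score(trial, times_ms)
print(score)
```

Computing one value per trial, rather than one per condition average, is what allows ITEM ORDER to enter the model as a continuous predictor.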
The focus of this experiment was whether the magnitude of the P600 effect
would change over the course of each block, depending on the probability of
anomalous sentences. The representation-based account predicts that participants
cannot adapt to morphosyntactic violations because ungrammatical sentences
have no licit representation to be activated. Thus, the P600 effect should appear
throughout both low- and equal-probability blocks. Alternatively, if the P600 effect
were to change for reasons other than adaptation, we expected an interaction
of VIOLATION × ITEM ORDER in both blocks, in addition to the main effect of
VIOLATION. In contrast, the expectation-updating account predicts that participants
can adapt to morphosyntactic violations with increasing exposure to them (though
not necessarily). Statistically, such adaptation should induce a significant interaction
of PROBABILITY × VIOLATION × ITEM ORDER, reflecting an attenuation of the P600 effect
during the equal-probability block.
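
What a three-way interaction of this kind encodes can be sketched with the fixed-effect part of a linear model. The example assumes treatment coding (low probability and grammatical as reference levels) and uses hypothetical coefficients that are not estimates from this chapter; random effects are omitted, and all function names are illustrative.

```python
def fixed_effect_row(probability, violation, item_order):
    """Fixed-effect predictors for one trial under assumed treatment coding."""
    p = 1.0 if probability == "equal" else 0.0
    v = 1.0 if violation == "ungrammatical" else 0.0
    o = float(item_order)
    return {
        "PROBABILITY": p,
        "VIOLATION": v,
        "ITEM_ORDER": o,
        "PROBABILITY:VIOLATION": p * v,
        "PROBABILITY:ITEM_ORDER": p * o,
        "VIOLATION:ITEM_ORDER": v * o,
        "PROBABILITY:VIOLATION:ITEM_ORDER": p * v * o,
    }

def predicted_amplitude(coefficients, probability, violation, item_order):
    """Linear predictor; coefficients missing from the dict count as zero."""
    row = fixed_effect_row(probability, violation, item_order)
    return coefficients.get("Intercept", 0.0) + sum(
        coefficients.get(name, 0.0) * value for name, value in row.items()
    )

def violation_effect(coefficients, probability, item_order):
    """Predicted ungrammatical-minus-grammatical difference (the P600 effect)."""
    return (
        predicted_amplitude(coefficients, probability, "ungrammatical", item_order)
        - predicted_amplitude(coefficients, probability, "grammatical", item_order)
    )

# Hypothetical coefficients chosen only to show the pattern of interest.
coefficients = {"VIOLATION": 2.0, "PROBABILITY:VIOLATION:ITEM_ORDER": -0.01}

for block in ("low", "equal"):
    effects = [round(violation_effect(coefficients, block, o), 2) for o in (1, 65, 130)]
    print(block, effects)
```

With these made-up values, the violation effect stays constant across the low-probability block and shrinks across the equal-probability block, which is the pattern the expectation-updating account predicts.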


2.2 Results 


The LME model showed a significant main effect of VIOLATION, with the ungrammatical
sentences having larger P600 amplitudes (β = 2.38, t = 2.70, p < 0.05)
(Figure 1). Importantly, the three-way interaction was significant (β =
-2.85, t = -7.277, p < 0.01). Planned comparison at each level of PROBABILITY showed


[Figure 1 plot. Panels: Low Probability (left), Equal Probability (right). Lines: morphosyntactic violation, grammatical. x-axis: Item Order.]


Figure 1: The P600 change in the low (left) and equal (right) probability blocks of Experiment 1. 

The x-axis denotes item order, and the y-axis denotes the amplitude of the P600 in the time window 
of 700-900 ms. Positivity is plotted upwards. Each line shows the P600 changes estimated by the LME 
models, and each dot indicates a P600 amplitude for every 20 trials. The gray areas refer to the 95% 
confidence interval. 
that ITEM ORDER had a different effect on the P600 effect for the two blocks. In
the equal-probability block, the interaction of VIOLATION and ITEM ORDER was
significant, with a decrease in the magnitude of the P600 effect (β = -1.57, t = -5.72,
p < 0.01). In contrast, the P600 effect increased during the low probability block
(β = 1.27, t = 4.56, p < 0.01).
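
As a consistency check on the reported estimates, the three-way coefficient can be compared with the difference between the two block-wise VIOLATION × ITEM ORDER slopes. This reading assumes that the three-way term is coded as the equal-probability slope minus the low-probability slope, which the chapter does not state; the variable names are illustrative.

```python
# Coefficients as reported in Section 2.2.
slope_equal = -1.57  # VIOLATION x ITEM ORDER, equal-probability block
slope_low = 1.27     # VIOLATION x ITEM ORDER, low-probability block
three_way = -2.85    # PROBABILITY x VIOLATION x ITEM ORDER

# Under the assumed coding, the three-way term is the difference in slopes.
slope_difference = slope_equal - slope_low
print(round(slope_difference, 2), round(abs(slope_difference - three_way), 2))
```

Under that assumption the two figures agree to within about 0.01, so the planned comparisons and the omnibus interaction describe the same divergence between blocks.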


2.3 Discussion 


The results of the equal-probability block demonstrated that the language-process- 
ing system can adapt to morphosyntactically violated sentences. The decreased 
magnitude of the P600 effect stemmed from the ungrammatical sentences elicit- 
ing a smaller P600 amplitude as the experiment progressed. Thus, the outcomes 
corroborated the prediction by the expectation-updating account. Although the 
exact functional contribution of P600 to language comprehension remains debated 
(Bornkessel-Schlesewsky and Schlesewsky, 2008; Brouwer, Fitz, and Hoeks, 2012; 
Kaan and Swaab, 2003a, 2003b; Kolk et al. 2003; Kuperberg, 2007; Van Herten, Kolk, 
and Chwilla, 2005), Fitz and Chang (2019) recently proposed that a P600 reflects the
cost of a learning process to develop an accurate probabilistic model. According to 
their computational model, when the language-processing system faces a process- 
ing error, it propagates the error back to the lower-level units to enable learning of 
probable input. If their interpretation of P600 is correct, the P600 reduction reflects 
a successful learning process that minimizes the processing error. 

Moreover, the observation that the opposite pattern was found for the low 
probability block supports the expectation-updating account. In the low probabil- 
ity block, the P600 amplitude decreased for grammatical sentences and increased 
for ungrammatical sentences. As the preverbal phrase provided useful information 
on the syntactic structure of a sentence in this block, the participants incorporated 
this information into their predictive computations. Consequently, processing was 
facilitated for the verb, attenuating the P600 amplitude of grammatical sentences. 
On the other hand, such an expectation should induce a severe processing cost for 
the verbs of ungrammatical sentences. Thus, the participants needed to repair the 
syntactic structure of the sentence upon encountering the verb. The rise in the P600 
amplitude observed for the ungrammatical sentences can be considered a conse- 
quence of such processing errors, since the participants expected a grammatical 
sentence more strongly as the experiment went along. 

In contrast, the present outcome is not compatible with the hypothesis that 
accounts for syntactic adaptation in terms of syntactic representation activation. 
According to this interpretation, increased base-level activation or the residual 
activation of syntactic frames facilitates the processing of subsequent sentences 
with the same syntactic frames. Because ungrammatical sentences (e.g., case-assignment
violations) do not have a licit syntactic representation that Japanese
speakers can build upon, it is impossible to increase the activation level of such 
syntactic representations. One may consider an alternative possibility that the par- 
ticipants built an improvised representation of the ungrammatical target sentences 
as they encountered them (cf. Kaschak and Glenberg 2004). However, this account 
is unlikely to explain the results because we kept the number of ungrammatical 
target sentences constant between the low- and equal-probability blocks; therefore, 
the repeated presentation of case-assignment violations should have facilitated 
their processing in the low- and equal-probability blocks to the same degree. We 
only observed the decline in the magnitude of the P600 effect for the equal-probability
block, which is at odds with the representation-based account.²

Thus, we interpreted the results as evidence suggesting that expectation-updating 
is an underlying mechanism for (morphosyntactic) adaptation. 


3 Experiment 2: Aspectual adaptation 


For Experiment 2, we tested the two accounts from different angles. A strong 
expectation violation signals to the language-processing system that the current 
prediction model is incorrect. Hence, if the expectation-updating account is on the 
right track, the language-processing system should attempt to update its expecta- 
tion (Fine and Jaeger 2013). In contrast, the representation-based account does not 
make such a prediction as long as the representations in question do not differ. 
To test this prediction, we employed aspectual coercion, which refers to when 
an aspectual type is forced to shift to another type to reconcile an aspectual mis- 
match between temporal adverbials and verbs, for instance (e.g., Jackendoff, 1997; 
Moens, 1987; Moens and Steedman, 1988). For example, the originally semelfactive
event of “sneeze” in “For 10 minutes, the student sneezed” is interpreted as an
iterative event given its co-occurrence with “for 10 minutes.”

Aspectual coercion induces additional processing costs compared to non-co- 
erced sentences (Bott, 2010; Long, 2011; Paczynski, Jackendoff, and Kuperberg, 2014; 
Yano and Sakamoto, 2016a). Yano (2018b) found that processing costs reflect two 
types of processes: an aspectual prediction error reflected by an ERP component 
called early anterior negativity (AN) and an aspectual mismatch resolution process 


2 The representation-based account also cannot account for the P600 increase in the grammatical
sentences of the equal probability block because it predicts the opposite pattern, namely that the
P600 amplitude decreases as the participants read more sentences, reflecting increased activation.
reflected by a later AN. For Experiment 2, we focused on the early AN because we
were interested in whether predictive processing plays a role in adaptation. 


3.1 Method 


We used two types of aspectual coercion: the additive and subtractive types, as 
exemplified in (2) and (3). The additive type refers to an aspectual shift from an 
atelic to a telic interpretation, as shown in (2b). In the subtractive type, a telic inter- 
pretation turns into an atelic interpretation given a for-adverbial phrase, as pre- 
sented in (3b). We assessed the additive and subtractive coercion effects against 
each control condition—portrayed in (2a) and (3a), respectively—which involves 
no aspectual shift. In addition to these two factors (coercion and type), we manipu- 
lated the temporal distance between the adverb and the verb, such that the adverb 
was placed at the sentence-initial position in Experiment 2A, while it was adjacent 
to the verb in Experiment 2B. Assuming that linguistic prediction develops as a 
function of time (even when available information does not change), a stronger 
prediction error should have occurred in the coercion conditions of Experiment 2A 
(Chow et al. 2018; Yano 2018a, 2018b).


(2) a. Control:
(30-pun-kan) kooti-ga sensyu-o (30-pun-kan)
30-minute-for coach-NOM player-ACC 30-minute-for
sidoo-si-ta.
instruct-do-PST
“The coach instructed the player for 30 minutes.”
b. Additive type:
(30-pun-de) kooti-ga sensyu-o (30-pun-de)
30-minute-in coach-NOM player-ACC 30-minute-in
sidoo-si-ta.
instruct-do-PST
“The coach instructed the player in 30 minutes.”


(3) a. Control:
(30-pun-de) sinnyuu-syain-ga syorui-o (30-pun-de)
30-minute-in new.employee-NOM paper-ACC 30-minute-in
insatu-si-ta.
print-do-PST
“The new employee printed the papers in 30 minutes.”
b. Subtractive type:
(30-pun-kan) sinnyuu-syain-ga syorui-o (30-pun-kan)
30-minute-for new.employee-NOM paper-ACC 30-minute-for
insatu-si-ta.
print-do-PST
“The new employee printed the papers for 30 minutes.”


Following the Latin square design, we presented 30 sentences of each condition 
to the participants. We added 60 implausible sentences as fillers (e.g., “The actress
found a wallet for 20 minutes”) to create correct NO responses to the acceptability
judgment task performed at the end of each trial. Unlike Experiment 1, for Exper- 
iment 2, we did not manipulate the ratio of sentences, as we could examine the 
adaptation effect by comparing the results of the two experiments. 

We recruited 32 native speakers of Japanese from Kyushu University and randomly
assigned them to either of the two experiments. The procedure and EEG
analyses followed those of Experiment 1. In Experiment 2, the dependent variables
were mean amplitudes of a 300–500 ms time window from the onset of the verbs
(i.e., “instructed/printed”). Based on previous observations (Yano 2018b; Yano and
Sakamoto 2016b), we included EEG data obtained from the anterior regions (Fz, 
F3/4, and F7/8). The fixed factors were COERCION (control/coercion), COERCION 
TYPE (additive/subtractive), and ITEM ORDER, with their interactions. 

This experiment focused on whether the magnitude of the early AN effect 
changed depending on the predictive power. If predictive processing were to play
a role in adaptation, we expected that the early AN effect would only decrease in 
Experiment 2A or more strikingly in Experiment 2A than in Experiment 2B. This 
effect should have surfaced as a significant interaction of COERCION and ITEM 
ORDER, especially in Experiment 2A. In contrast, the representation-based account 
does not predict such a difference due to word order because both word orders 
end up having the same aspectual interpretation; thus, repeated exposure to
coercion should facilitate its processing to the same extent.


3.2 Results 


The LME model showed a significant interaction of COERCION and ITEM ORDER
(β = -0.04, t = 1.99, p < 0.05) in addition to a marginally significant main effect of
COERCION (β = -1.07, t = -1.82, p = 0.07) in Experiment 2A (i.e., Adv-S-O-V order).
The three-way interaction was not significant (p > 0.10). The results suggest that the
early AN effect faded out in both coercion types toward the end of the experiment
(Figure 2). However, Experiment 2B (i.e., S-O-Adv-V order) did not reveal a significant
interaction of COERCION and ITEM ORDER (β = 0.00, t = -0.28, p > 0.10) but
showed a significant main effect of COERCION (β = -0.94, t = 2.21, p < 0.05). Thus,
the early AN effect persisted throughout the experiment (Figure 3).

Figure 2: The early anterior negativity (AN) change in the additive (left) and subtractive (right) coercion
in Experiment 2A.
The x-axis denotes item order, and the y-axis denotes the amplitude of the early AN in the 300-500
ms time window. Positivity is plotted upwards. Each line shows the early AN changes estimated by
the LME models, and each dot indicates an estimated early AN amplitude for every 20 trials. The gray
areas refer to the 95% CI.

Figure 3: The early anterior negativity (AN) change in the additive (left) and subtractive (right)
coercion in Experiment 2B.
The x-axis denotes item order, and the y-axis denotes the amplitude of the early AN in the 300-500
ms time window. Positivity is plotted upwards. Each line shows the early AN changes estimated by the
LME models, and each dot indicates an early AN amplitude for every 20 trials. The gray areas refer to
the 95% CI.


3.3 Discussion 


The results of Experiment 2A imply that the participants adapted to additive and 
subtractive types of aspectual coercion as the experiment progressed. This change was not
attributable to a simple familiarization with aspectual coercion or other factors 
that may have changed during the experiment because we noted a strikingly dif- 
ferent pattern in Experiment 2B. The difference between the two experiments 
suggests that predictive processing plays a crucial role in adaptation. In line with 
the expectation-updating hypothesis, a stronger prediction error induced a greater 
adaptation effect (Jaeger and Snider 2013; Dell and Chang 2014; cf. Chang, Dell, 
and Bock 2006). Given that the participants had much time to predict the verbs, 
the aspectually mismatching verbs induced a stronger prediction error, triggering 
an expectation update. In contrast, the representation-based hypothesis does not 
provide a simple explanation for why Japanese speakers adapt to an aspectual mis- 
match only when the prediction was strongly disconfirmed. 


4 Experiment 3: Semantic adaptation 


Experiments 1 and 2 indicate that the language-processing system flexibly updates 
its expectation about upcoming input. This finding brings up a new question of how 
adaptive the language-processing system is. As we have seen thus far, adaptation to 
linguistic environments allows for efficient language processing. However, it argu- 
ably consumes processing resources, as it requires the language-processing system 
to track how probable sentences are in an ever-changing linguistic environment. 
Given this trade-off, linguistic adaptation may be somewhat limited. For example,
the language-processing system may not adapt to semantic violations when it does 
not forecast processing benefits for the future. 

For Experiment 3, we used semantic violations, which, like semantically unexpected
words, elicit an N400 effect (e.g., Kutas and Federmeier, 2000, 2011;
Kutas and Hillyard, 1980, 1984; Lau et al. 2013, 2016; Nieuwland et al. 2020). Although 
the exact functional role of the N400 in sentence processing is a matter of debate, 
there is a consensus that it reflects lexico-semantic processing. Thus, if repeated 
exposure to semantic violations alleviates lexico-semantic processing difficulty, the
magnitude of the N400 effect should have decreased during the experiment. 


4.1 Method 


The sentence in (4a) is semantically natural. In contrast, the sentence in (4b) is
semantically anomalous because the verb “naita” (cried) takes an inanimate noun,
“shikibo” (music baton), as its subject. As in Experiment 1, we manipulated
the probability of (un)grammatical sentences with filler sentences from Experi- 
ment 1. Thus, each list included 26 sentences for each condition, plus 60 filler sen- 
tences. 


(4) a. Semantically natural sentence:
shinseiji-ga nai-ta.
baby-NOM cry-PST
“The newborn baby cried.”

b. Semantic violation:
#shikibo-ga nai-ta.
baton-NOM cry-PST
“The baton cried.”


Twenty native speakers of Japanese from Kyushu University took part in Experiment 3.
We analyzed the N400 change using 300–500 ms mean amplitudes from
the onset of verbs, recorded at the centroparietal regions (Cz, Pz, C3/4, P3/4, P7/8,
and O1/2). The fixed factors were PROBABILITY (low/equal), VIOLATION (semantically
natural/unnatural), and ITEM ORDER of each block, with their interactions.
The adaptation effect should have manifested as a significant three-way
interaction, with the interaction of VIOLATION and ITEM ORDER being greater in
the equal-probability block than in the low probability block.


4.2 Results 


The main effect of VIOLATION was marginally significant (β = -1.55, t = 1.99,
p = 0.05). Importantly, however, we did not observe a significant interaction of
PROBABILITY, VIOLATION, and ITEM ORDER (β = -0.90, t = -1.49, p > 0.10) (Figure 4).
Although visual inspection of Figure 4 suggests that the N400 effect became more
pronounced with increasing exposure, the interaction of VIOLATION and ITEM ORDER was not
significant (β = -0.31, t = -1.51, p > 0.10). Therefore, there was no evidence in favor
of adaptation to semantic violations.


[Figure 4 plot. Panels: Low Probability (left), Equal Probability (right). Lines: natural, anomalous. x-axis: Item Order.]


Figure 4: The N400 change in the low (left) and equal (right) probability blocks of Experiment 3. 
The x-axis denotes item order, and the y-axis denotes the amplitude of N400 in the 300-500 ms 
time window. Positivity is plotted upwards. Each line shows the N400 changes estimated by the 
LME models, and each dot indicates an N400 amplitude for every 20 trials. The gray areas refer to 
the 95% CI.


5 General discussion 


Unlike morphosyntactic and aspectual violations, repeated exposure to semantic
violations did not resolve the processing difficulty associated with the N400, which
suggests limits to the adaptive behavior in language comprehension. The selective pattern
implies that the language-processing system rationally decides whether to adapt to 
linguistic environments. Given that language communication usually takes place to 
exchange informative messages, it is plausible to think that people only adjust their 
expectations to what a reasonable speaker would say. For example, sentences such 
as “The rose-ACC withered” are morphosyntactically ill-formed, but the intended 
meaning is clear and semantically plausible. Aspectual coercion involves a resolv- 
able mismatch; thus, it has a coherent interpretation in the first place. In contrast, 
semantic violations used in the present study are not sentences that a reasonable 
speaker would say. The intended meaning of the utterance is unrecoverable as it 
greatly deviates from comprehenders’ expectations of what a speaker is likely to say. 
Therefore, comprehenders are less willing to adapt to semantic violations.
However, other factors are worth considering. For instance, Caffarra and Martin
(2019) showed that native Spanish speakers exhibited a P600 effect for infrequent
errors non-native speakers produced (subject-verb number disagreement) but not
for frequent errors (gender disagreement), suggesting that the typicality of errors
affects adaptation. In this case, semantic violations are far less typical than
morphosyntactic and aspectual violations, thus requiring more solid evidence for the
language-processing system to change the expectation (see also Nieuwland and Van
Berkum 2006).


6 Conclusion 


We investigated the flexible aspects of the language-processing system using ERPs. 
The findings suggest that native speakers of Japanese can rapidly adjust their 
expectations for morphosyntactic and aspectual violations. However, we found 
no evidence for adaptation to semantic violations. We argue that these differences 
reflect comprehenders' expectations of a message that a speaker is (un)likely to 
convey or the typicality of errors. As linguistic communication involves a joint 
activity between different interlocutors who have more or less different linguistic 
knowledge and preferences, linguistic adaptation works as a mechanism for cooperative
alignment from the comprehenders' side.

Jun Nakajima and Shinri Ohta

Chapter 8 

(Dis)similarities between semantically 
transparent and lexicalized nominal 
suffixation in Japanese: An ERP study using 
a masked priming paradigm 


1 Introduction 


Two representative models have been proposed in psycholinguistic literature for 
the lexical processing of morphologically complex words: dual- and single-route 
models. The dual-route model presupposes that morphologically complex words 
are processed by two distinct mechanisms: rule and memory (Pinker and Prince 
1988; Pinker 1999). The former mechanism (i.e., rule) explains the processing of 
complex words with regularity or productivity. For instance, an English regular 
past-tense verb “worked,” composed of two morphemes of the verb stem (work) and
suffix (-ed), is recognized as the past-tense suffix attached to its stem. Conversely,
the latter mechanism (i.e., memory) explains complex words that are irregular or 
unproductive (Stockall and Marantz 2006). For example, an English irregular past- 
tense verb “fell,” which does not have an apparent verb stem and past-tense suffix, 
is assumed to be memorized in the mental lexicon and recognized as a whole word 
form. Relative to the word fell, worked has an expectable rule-based form and its 
suffix -ed is productive in that it can attach to any other new verbs. Unlike sentence 
processing based on syntactic rules, word processing has characteristic features of 
regularity and productivity. Thus, the dual-route model is considered a plausible 
theory, supported by behavioral (Pinker 1991) and neurolinguistic studies (Pinker 
and Ullman 2002; Marslen-Wilson 2007). 

Building on the dual-route model, a neurolinguistic study by Hagiwara et al.
(1999) claims that this model can be applied to two types of de-adjectival nouns in
Japanese (-sa noun: yowa-sa, weakness; -mi noun: yowa-mi, weak point). As with
English examples, the former -sa noun is extremely productive and semantically 
transparent. However, the latter -mi noun is unproductive in that the nominal 
suffix -mi only attaches to a small set of adjectives with a monomorphemic stem 
(hence, we call these stems -mi type stems). Further, given the lexicalized meaning 
of -mi nouns, it is difficult to expect their meanings from their corresponding adjec- 
tives. In their studies, Hagiwara et al. (1999) reported that agrammatic Broca's 
aphasic patients showed challenges in producing -sa nouns, with little difficulty 
in -mi nouns. The other type of aphasic patients, especially the patients of Gogi 
(word-meaning) aphasia, showed the opposite pattern (i.e., challenges in produc- 
ing -mi nouns). From the results, they concluded that these differences (or dissim- 
ilarities) reflected the two neurologically independent rule- and memory-based 
word-processing mechanisms, where the representation in the mental lexicon of
-sa nouns is a stem and a suffix, while that of -mi nouns is a whole word (i.e., not
decomposed into morphemes). They claimed that agrammatic aphasics showed
impaired processing of -sa nouns because they cannot use a grammatical rule to 
combine a stem and a suffix. Despite their grammatical impairment, the patients 
could process -mi nouns, which are lexicalized and whose meaning can be under- 
stood without requiring the grammatical rule. 

However, neurolinguistic studies on healthy people reported evidence for the 
single-route model, in which complex words are processed by a common mech- 
anism regardless of their regularity and productivity. Two components of cor- 
tical activation have been investigated as hallmarks of how complex words are 
processed (i.e., M350 and M170) (M stands for magnetic). The M350 component is 
generated around 300-500 ms after the word onset in the middle temporal gyrus.
The M350 comprises two subcomponents, at least, which reflect lexical access to 
morphemes and the recombination of morphemes (Pylkkänen and Marantz 2003). 
The other component, M170, is generated around 150-200 ms after the word onset 
in the left fusiform and inferior temporal gyri. It is considered that this component 
reflects the automatic decomposition of complex words. Using magnetoencepha- 
lography (MEG), Solomyak and Marantz (2010) reveal that the effect of lemma tran- 
sition probability (TP) (probability of a suffix from the stem of a word) modulates 
the M350 component, while the surface frequency of a word does not affect it. From 
these results, they conclude that this activity reflects the processing of lexical access 
at the morpheme level but not the whole word level. In another MEG study, based
on analyses of the M170 (i.e., an MEG component reflecting morphological
decomposition), Fruchter, Stockall, and Marantz (2013) report larger activation for
irregular past-tense forms than monomorphemic words (pseudo-irregular words); they
conclude that the regular and irregular verbs are decomposed into morphemes in
the same manner. These neurolinguistic studies suggest a common neural mechanism
for the lexical processing of morphologically complex words, irrespective of
the regularity or productivity of words.

Further, recent behavioral studies also provide evidence inconsistent with the
dual-route model. In behavioral studies, the masked priming paradigm is widely
applied to investigate the visual processing of morphologically complex words. In 
this paradigm, a priming word stimulus is presented on a screen extremely rapidly, 
followed by a target word. Rastle et al. (2000) report the facilitation of reaction time 
(RT) (priming effect) for morphologically related primes and targets (e.g.,
teacher-TEACH) but not semantically related pairs (e.g., cello-VIOLIN). Interestingly, this
facilitation is also observed for morphologically unrelated but orthographically 
related pairs (e.g., brother-BROTH) (Rastle, Davis, and New 2004), suggesting that the 
masked priming paradigm can provide a situation where morphological effects are 
observed “in the absence of semantic effects” (Rastle et al. 2000: 517). Such priming 
effects reflect some process of decomposition that is different from morphological 
decomposition (i.e., “form-based [presemantic] processing”) (Rastle and Davis 2008; 
Morris and Stockall 2012). Moreover, Rastle, Davis, and New (2004) report that neither 
orthographically related controls (brothel-BROTH) nor orthographically similar 
pairs (boil-BROIL) showed the priming effects. Furthermore, Crepaldi et al. (2010)
report that irregular verb pairs, such as fell-FALL, show a facilitating priming
effect similar to that of regular verb pairs, while pseudo-irregular pairs (bell-BALL)
did not. Such behavioral studies suggest that morphologically complex words are
processed by primarily using morphological cues (i.e., identical morphemes between
the prime and target). Orthographic cues (e.g., similarity of the letters) also play a role in the
masked priming paradigm. However, it should be noted that orthographic cues 
alone cannot elicit the robust priming effect (e.g., brothel-BROTH). 

In an MEG study, Lewis, Solomyak, and Marantz (2011) also demonstrate that 
only morpho-orthographically complex words enhance the amplitude of the M170, 
while the orthographically related control does not (brother vs. brothel). Moreover, 
using the masked priming paradigm, Fruchter, Stockall, and Marantz (2013) report 
that the masked priming effects on M170 activity were modulated by morpho- 
orthographically complex pairs and irregular verb pairs but not pseudo-irregular 
pairs (jumped-JUMP, fell-FALL vs. bell-BALL). Together, these results demonstrate 
that the masked priming paradigm and M170 allow for revealing the decomposition 
of morpho-orthographic properties for complex words (see also Gwilliams 2020). 

In the event-related potentials (ERP) literature, the N400 and N170 are
considered the electrophysiological analogs of the M350 and M170, respectively (e.g.,
Pylkkänen and Marantz 2003). The N400, reflecting lexical processes, is generated
approximately 300-500 ms after the onset of words, while the N170, which reflects
the automatic decomposition of complex words, is generated approximately
150-200 ms after. In an ERP study employing masked priming, Lavric, Clapp, and
Rastle (2007) report that morphologically related prime-target pairs attenuated
the peak of the N400 in the parieto-occipital area, while orthographically related 
prime-target pairs did not. In an MEG study employing the masked priming para- 
digm, Fruchter, Stockall, and Marantz (2013) also report a difference in M350 cor- 
tical activation between morphologically related and unrelated pairs. The studies 
show that morpho-orthographically related prime-target pairs lighten the process- 
ing load of the target given the shared morpheme of these pairs being activated 
through priming. Hence, employing the masked priming paradigm in ERP studies 
is an effective way to investigate the processing of morphologically complex words. 
RTs and the N170 are indicators of morpho-orthographical decomposition. More- 
over, the N400 will reflect semantic aspects of processing target complex words, 
which cannot be investigated by behavioral or N170 data. 

A recent behavioral masked priming study on Japanese de-adjectival nouns 
for healthy adults shows robust masked priming effects on -sa and -mi nouns, sug- 
gesting that decomposition occurs while processing the two types of derived nouns 
(Clahsen and Ikemoto 2012). However, unlike many English studies on the mecha- 
nism of word processing, neurophysiological evidence has not yet been reported 
for Japanese. 

Using electroencephalography (EEG), this chapter investigates the neural 
mechanism for processing complex words in the Japanese de-adjectival nouns by 
examining the ERP components, N400 and N170, crucial for lexical processing in 
the brain. We employ the masked priming paradigm to test the neural mechanism 
of morpho-orthographic and semantic aspects by behavioral and electrophysiolog- 
ical evidence. While it is beyond this study’s scope to propose a comprehensive 
model of lexical processing of morphologically complex words (Embick, Creemers, 
and Goodwin Davies 2021), the primary aim is to more broadly demonstrate neuro- 
linguistic evidence of processing of the two types of Japanese de-adjectival nouns. 
Based on recent neurolinguistic and behavioral studies, we test the hypothesis that 
the two types of Japanese de-adjectival nouns are processed by common neural 
mechanisms, especially in an earlier stage (i.e., morphological decomposition), pro- 
posed in prior neuroimaging studies in English past-tense verbs. With this hypothe- 
sis, the left-lateralized N170 will be observed in both de-adjectival nouns. Moreover, 
behavioral data for these de-adjectival nouns will show significant priming effects, 
and the N400 will be attenuated when the prime-target pairs have common stems 
(i.e., priming effects on the N400). Given their lexicalized meanings, we further 
hypothesize that -mi nouns will show a larger N400 peak than -sa nouns in the later 
stage, which reflects lexical access and recombination of morphemes. 

This study focuses on the following four prime-target conditions of -mi type stems
to elucidate the morphological processing of transparent or lexicalized derived
nouns: related condition of -sa nouns (e.g., yowa-i → yowa-sa, weak → weakness),
unrelated condition of -sa nouns (e.g., tura-i → yowa-sa, stressful → weakness),
related condition of -mi nouns (e.g., yowa-i → yowa-mi, weak → weak point), and
unrelated condition of -mi nouns (e.g., tura-i → yowa-mi, stressful → weak point).
Previous priming studies used more complex or derived words as primes (e.g., 
teacher-TEACH). However, this study uses derived nouns as targets because we 
want to examine behavioral and ERP data of the two types of de-adjectival nouns 
in Japanese, not the adjectives themselves. Moreover, to examine whether priming 
effects in behavioral and ERP data are independent of the conflict between the two 
types of de-adjectival nouns (-sa and -mi nouns), we include additional
de-adjectival nouns that cannot derive -mi nouns (“non-mi type”: e.g., tura-sa,
stressfulness; subaya-sa, quickness) as stimuli.


2 Materials and methods 
2.1 Participants 


We recruited 19 right-handed native speakers of Japanese (seven males, mean ±
standard deviation [SD]: 22.1 ± 2.2 yrs.). All participants provided written informed
consent before participating in the study. One participant was excluded because
of excessive artifacts in the EEG data; thus, 18 datasets were used for analysis
(seven males, 22.0 ± 2.2 yrs.). All 18 participants showed right-handedness
(laterality quotient: 88.9 ± 13.6), as determined by the Edinburgh Handedness Inventory
(Oldfield 1971). All participants had normal or corrected-to-normal vision and no 
history of neurological or psychiatric disease. Approval for the experiments was 
obtained from the institutional review board of the Department of Linguistics, 
Faculty of Humanities, Kyushu University. 


2.2 Stimuli 


Based on adjectival stem frequencies, we selected 78 prime words from the “Balanced
Corpus of Contemporary Written Japanese” (BCCWJ, National Institute for Japanese
Language and Linguistics, https://ccd.ninjal.ac.jp/bccwj/) (Maekawa et al. 2014).
We created 156 target words by replacing the last syllable with -sa or -mi. The
prime-target pairs were categorized into three groups, characterized by their stems
(26 pairs each): -mi type stem (with a simple stem), non-mi type with a simple stem,
and non-mi type with a complex stem. The -mi type stem condition allows for two
types of derivation (e.g., yowa-i → yowa-sa, yowa-mi), while the non-mi type stems
do not allow -mi suffixation (e.g., tura-i → tura-sa/*tura-mi, stressful → stressfulness;
subaya-i → subaya-sa/*subaya-mi, quick → quickness). We created 156 nonce words
by changing the last syllable(s) of the adjective -i to -si-i, or -si-i to -i (e.g., yowa-i
(weak) → *yowasi-i; kurusi-i (distressed) → *kuru-i). We presented these stimuli
(primes and targets) written in a combination of kanji (Chinese characters) and
hiragana (Japanese syllabic characters), except for the following cases, which may
elicit unexpected orthographic effects (see the Appendix for the list of the
adjectives used). First, to avoid the ambiguity of reading, the word “辛さ,” which could
be read as kara-sa (spiciness) or tura-sa (stressfulness), was represented as “からさ”
and “つらさ.” Second, phonetic equivalents, kanji used as phonetic symbols
rather than for their meanings, were also presented in hiragana. For example,
“面白さ” was shown as “おもしろさ” (omosiro-sa, interest). The same
applied to the following words: “おかしさ” (可笑しさ, okasi-sa, fun), “かわいさ”
(可愛さ, kawai-sa, cuteness), and “きまずさ” (気不味さ, kimazu-sa,
unpleasantness). Further, the words whose corresponding nonce words can be read as
existing Japanese words were displayed in hiragana. For example, the nonce word
corresponding to “苦しさ” would be written as “苦さ,” which can be read as niga-sa
(bitterness). Thus, this word was represented in hiragana (“くるしさ,” kurusi-sa,
distress). The same applied to the following words: “にがさ” (苦さ, niga-sa,
bitterness), “むずかしさ” (難しさ, muzukasi-sa, difficulty), and “わるさ” (悪さ,
waru-sa, badness). Lastly, the word “とろさ” (toro-sa, stupidity/slowness) was displayed
in hiragana, as this word cannot be written using kanji. Moreover, the unrelated
condition of prime-target pairs with morphological relationships was excluded
from stimuli (e.g., 早い-素早さ, haya-i and subaya-sa, early-quickness).
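
The derivation rules for targets and nonce words described above can be sketched as simple string operations on romanized, hyphen-segmented forms. This is only an illustrative sketch: the function names are hypothetical, and the actual stimuli were prepared in Japanese orthography, not from romanized strings.

```python
def derive_target(adjective, suffix):
    """Replace the final -i of a segmented adjective with a nominal suffix."""
    if not adjective.endswith("-i"):
        raise ValueError("expected an adjective segmented as stem-i")
    return adjective[:-1] + suffix  # "yowa-i" -> "yowa-" + "sa"


def derive_nonce(adjective):
    """Swap the adjective endings: -si-i becomes -i, and plain -i becomes -si-i."""
    if adjective.endswith("si-i"):
        return adjective[:-4] + "-i"    # "kurusi-i" -> "kuru-i"
    if adjective.endswith("-i"):
        return adjective[:-2] + "si-i"  # "yowa-i" -> "yowasi-i"
    raise ValueError("expected an adjective segmented as stem-i")


targets = [derive_target("yowa-i", s) for s in ("sa", "mi")]
nonce_words = [derive_nonce(a) for a in ("yowa-i", "kurusi-i")]
```

The check for -si-i comes first because every -si-i adjective also ends in -i.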

We compared the effects of length, surface frequency, and TP on ERP compo- 
nents as variables to examine the effects of word form and lexical properties on 
brain activity. The length of a word was defined by the number of letters. The mean 
values of these variables across the two conditions (-sa nouns and -mi nouns) did 
not show any significant differences (Table 1). Following Solomyak and Marantz
(2010), we defined the lemma TP between the stem and suffix of each target word
as follows, where Freq represents the lemma frequency of a word form:
TP(stem → suffix) = P(stem + suffix) / P(stem) ∝ Freq(stem + suffix) / Freq(stem). For
instance, TP(yowa- → -mi) is calculated as follows: P(yowa-mi) /
P(yowa-) ∝ Freq(yowa-mi) / Freq(yowa-). We used the BCCWJ to calculate the lemma
frequencies. For the following analyses, we used log-transformed frequencies and 
TP to reduce the skewness of the data distribution. 
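
As a worked illustration of this definition, the sketch below estimates the log-transformed TP from two frequency counts. The counts are invented for the example, the function name is hypothetical, and base 10 is an assumption, since the base of the logarithm is not stated here.

```python
import math


def log_transition_probability(freq_stem_suffix, freq_stem):
    """log10 of TP(stem -> suffix), estimated as Freq(stem + suffix) / Freq(stem)."""
    if freq_stem <= 0 or freq_stem_suffix <= 0:
        raise ValueError("frequencies must be positive")
    if freq_stem_suffix > freq_stem:
        raise ValueError("a derived form cannot be more frequent than its stem")
    return math.log10(freq_stem_suffix / freq_stem)


# Invented counts: the stem occurs 1000 times, 50 of them with the suffix.
log_tp = log_transition_probability(50, 1000)  # log10(0.05), about -1.30
```

Because a derived form occurs less often than its stem, the ratio stays below one and the log-transformed value is negative.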


1 Two kanji, “面” and “白,” represent face/mask and white, respectively.

Table 1: Summary statistics for experimental stimuli.

Statistics                            -mi type stems                  t values       p-values
                                      -sa nouns       -mi nouns
Log frequency (Prime)                 3.57 ± 0.64     3.57 ± 0.64     N.A.           N.A.
Length (Target)                       2.54 ± 0.80     2.54 ± 0.80     N.A.           N.A.
Log stem frequency (Target)           2.26 ± 0.08     2.54 ± 0.35     t(25) = 2.72   .41
Log transition probability (Target)   -1.44 ± 0.26    -1.75 ± 0.22    t(25) = 2.72   .10

Data are shown as mean ± standard deviation. The t values and p-values summarize two-tailed
paired t-tests between -sa nouns and -mi nouns. Note that the paired t-tests were not applicable
(N.A.) for “log frequency (prime)” and “length (target)” because the same set of primes was used
for -sa and -mi nouns, and the length of -sa and -mi nouns was always equivalent.


2.3 Procedures 


We conducted a visual lexical decision task with masked priming, using PsychoPy
3.0.7, a psychophysics software program written in Python (Peirce 2007). The
displayed fonts were Meiryo and Arial for Japanese characters and other letters,
respectively. Each trial started with a fixation cross (“+”) at the center of the screen
for 500 ms, followed by a blank screen for 500 ms. A string of hash marks (a row of
“#” characters) followed the blank screen for 500 ms and served as a forward mask for
the incoming prime word. The adjective prime word (e.g., 弱い, “yowa-i,” weak) was
presented after the string of hash marks and remained for 48 ms; it was followed
by the target word (e.g., 弱さ, “yowa-sa,” weakness; 弱み, “yowa-mi,” weak point)
about which the participant was required to make a lexical decision. The target
word was retained until the participant responded and was then replaced by a
variable blank serving as an inter-trial interval. The inter-trial interval varied between
600 and 1200 ms. The primes and targets were presented in a random order for each
participant (Figure 1).

[Figure 1 panels: Fixation (500 ms) → Blank (500 ms) → Hash marks (500 ms) → Prime (48 ms) → Target (until button press) → Inter-trial interval (600-1200 ms)]

Figure 1: Experimental design using the masked priming paradigm. The participants performed a
visual lexical decision task on the target words by pressing buttons.

During the experiment, the participants were seated in an electromagnetically shielded
room with a monitor positioned approximately 130 cm in front of them. They were
instructed to judge whether the displayed letter strings were words by pressing 
buttons. The lexical decision task was preceded by 18 practice trials, which included 
feedback. Each participant took a short break approximately every five minutes 
during the experiment. 
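
The trial sequence described in this section (Figure 1) can be summarized as an ordered list of events with durations. The sketch below is not the PsychoPy script used in the experiment: the function name and event labels are hypothetical, and the uniform integer draw for the inter-trial interval is an assumption, since only the 600-1200 ms range is given.

```python
import random


def build_trial(prime, target, rng):
    """Return the ordered events of one trial as (label, content, duration_ms).

    A duration of None means the event lasts until the button press.
    """
    iti_ms = rng.randint(600, 1200)  # assumed uniform draw, inclusive bounds
    return [
        ("fixation", "+", 500),
        ("blank", "", 500),
        ("forward_mask", "<hash marks>", 500),  # placeholder for the mask string
        ("prime", prime, 48),
        ("target", target, None),
        ("inter_trial_interval", "", iti_ms),
    ]


rng = random.Random(0)  # seeded only to make the sketch reproducible
trial = build_trial("yowa-i", "yowa-sa", rng)

# Time from trial onset to target onset: 500 + 500 + 500 + 48 ms.
pre_target_ms = sum(duration for _, _, duration in trial[:4])
```

In this layout the target always appears 1548 ms after trial onset, and only the inter-trial interval varies from trial to trial.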


2.4 Electroencephalography acquisition 


We acquired the EEG by fitting the participants with an elastic cap with 64 embedded
Ag/AgCl electrodes (actiCAP, Brain Products; Neurofax EEG-1200, Nihon Kohden).
Two additional AgCl electrodes were placed below and to the left of the left eye
to monitor vertical and horizontal eye movements. The linked earlobes served as
a reference. All electrode impedances were maintained below 45 kΩ throughout
the experiments. The ERPs were amplified with a bandpass filter of 1-30 Hz and
sampled at 1,000 Hz.


2.5 Data analyses 


We used repeated-measures analyses of variance (rANOVAs) for behavioral and 
ERP data analyses. We considered a corrected p-value of less than 0.05 to be sta- 
tistically significant in the following analyses. We applied the Greenhouse-Geisser 
correction for all effects involving more than one degree of freedom (Greenhouse 
and Geisser 1959, Picton et al. 2000) and corrected the multiple comparisons using 
Shaffer’s modified sequentially rejective Bonferroni procedure (Shaffer 1986). For 
significant main effects with more than two levels, we conducted post-hoc pairwise 
comparisons using paired t-tests. We also ran simple effects tests for significant 
interactions. We reported the corrected degrees of freedom, corrected p-values,
and effect sizes (generalized eta squared [η²G] for ANOVA and Cohen’s d for t-tests).


2.6 Behavioral data processing 


We excluded trials with incorrect responses and results exceeding ±2.5 SD from
the average (approximately 8.4% of the data) (Baayen 2008). We conducted two
independent analyses for the accuracy and RTs: a two-way rANOVA (Prime Type [Related
and Unrelated] × Suffix [-sa and -mi]) for -mi type stem (e.g., yowa-sa, yowa-mi) to
examine the priming effect of different de-adjectival nouns, and an additional two-way
rANOVA (Prime Type [Related and Unrelated] × Stem Type [-mi type stem,
non-mi type with a simple stem, and non-mi type with a complex stem]) for -sa
nouns (e.g., yowa-sa, tura-sa, and subaya-sa; weakness, stressfulness, and
quickness, respectively) to compare the priming effect of different stem types.
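
The ±2.5 SD exclusion criterion can be illustrated with a short sketch. It assumes that the criterion is applied to RTs, that the mean and the sample SD are computed over whatever set of RTs is passed in, and that the grouping (per participant, per condition, or overall) is decided by the caller, since this is not specified here. The function name and RT values are invented.

```python
from statistics import mean, stdev


def trim_rts(rts_ms, criterion_sd=2.5):
    """Keep RTs within mean +/- criterion_sd * SD of the supplied sample."""
    if len(rts_ms) < 2:
        return list(rts_ms)
    m = mean(rts_ms)
    sd = stdev(rts_ms)  # sample SD (n - 1 in the denominator)
    low, high = m - criterion_sd * sd, m + criterion_sd * sd
    return [rt for rt in rts_ms if low <= rt <= high]


# Invented RTs for correct trials; the last value is an obvious outlier.
rts = [650, 700, 720, 680, 710, 690, 705, 695, 715, 685,
       700, 660, 740, 730, 670, 700, 710, 690, 705, 3000]
kept = trim_rts(rts)
```

Note that the outlier itself inflates the SD used for the cutoff, so with very few trials even an extreme value can survive this kind of trimming.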


2.7 Event-related potentials data processing 


We used MNE-Python for the ERP data analysis (Gramfort et al. 2013) and applied
independent component analysis to remove eyeblink and ECG artifacts. We computed
ERP averages for a 600 ms time window under all conditions. The baseline was set
to 100 ms before the stimulus onset. We excluded trials with incorrect responses
and large artifacts exceeding 80 µV from further analysis (approximately 11.8% of
the data).
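
The baseline correction and amplitude-based rejection steps can be sketched in plain Python as follows. This is not the MNE-Python pipeline itself: the function names are hypothetical, the 80 µV criterion is treated as an absolute-amplitude threshold (it could equally be a peak-to-peak criterion), and the epoch layout of 100 baseline samples followed by 600 post-stimulus samples at 1,000 Hz is an assumption.

```python
def baseline_correct(epoch_uv, n_baseline_samples):
    """Subtract the mean of the pre-stimulus samples from every sample."""
    if not 0 < n_baseline_samples <= len(epoch_uv):
        raise ValueError("baseline must cover between 1 and len(epoch) samples")
    baseline = sum(epoch_uv[:n_baseline_samples]) / n_baseline_samples
    return [value - baseline for value in epoch_uv]


def exceeds_threshold(epoch_uv, threshold_uv=80.0):
    """True if any sample's absolute amplitude exceeds the threshold."""
    return any(abs(value) > threshold_uv for value in epoch_uv)


# Toy epochs: 100 baseline samples followed by 600 post-stimulus samples.
clean = baseline_correct([5.0] * 100 + [3.0] * 600, 100)
noisy = baseline_correct([0.0] * 100 + [120.0] + [0.0] * 599, 100)
kept = [epoch for epoch in (clean, noisy) if not exceeds_threshold(epoch)]
```

Only the first toy epoch survives, because the second contains a 120 µV spike.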

For the analyses of the N400, epoched ERPs were averaged for the following
seven regions of interest (ROIs): fronto-central (FC: Fp1, Fp2, AF3, AFz, AF4, F1,
Fz, F2), central (C: FC1, FC2, C1, Cz, C2, CP1, CPz, CP2), left frontal (LF: AF7, F7, F5,
F3, FT9, FT7, FC5, FC3), right frontal (RF: AF8, F8, F6, F4, FT10, FT8, FC6, FC4), left
temporal (LT: T7, C5, C3, TP9, TP7, CP5, CP3, P7, P5, P3), right temporal (RT: T8, C6,
C4, TP10, TP8, CP6, CP4, P8, P6, P4), and parieto-occipital (PO: P1, Pz, P2, PO7, PO3,
POz, PO4, PO8, O1, Oz, O2, Iz) (Figure 2A). Following other masked priming studies
of the N400/M350, the time window for the N400 was restricted to 300-500 ms
(Morris and Stockall 2012; Fruchter, Stockall, and Marantz 2013). We ran a three-way
rANOVA (Prime Type [Related and Unrelated] × Suffix [-sa and -mi] × ROIs
[FC, LF, C, RF, LT, RT, and PO]) on the mean amplitude of the time window of the
N400 to identify the ROIs in which the N400 differed between related and
unrelated conditions. When significant main effects or interactions were found, we
performed follow-up two-way rANOVAs to assess the presence of priming effects in
each condition.
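
Averaging over the electrodes of an ROI can be sketched as follows. Only two of the seven ROIs are shown, with electrode names taken from the list above. The function name, the dictionary layout, and the amplitude values are hypothetical.

```python
N400_ROIS = {
    "FC": ["Fp1", "Fp2", "AF3", "AFz", "AF4", "F1", "Fz", "F2"],
    "C": ["FC1", "FC2", "C1", "Cz", "C2", "CP1", "CPz", "CP2"],
}


def roi_mean(channel_values_uv, roi_channels):
    """Average one value per channel over the channels of a region of interest."""
    missing = [ch for ch in roi_channels if ch not in channel_values_uv]
    if missing:
        raise KeyError(f"channels not found: {missing}")
    return sum(channel_values_uv[ch] for ch in roi_channels) / len(roi_channels)


# Invented 300-500 ms mean amplitudes (microvolts), one value per channel.
amplitudes = {ch: -1.0 for ch in N400_ROIS["FC"]}
amplitudes.update({ch: -3.0 for ch in N400_ROIS["C"]})
roi_values = {name: roi_mean(amplitudes, chans) for name, chans in N400_ROIS.items()}
```

One such value per participant, condition, and ROI would then enter the rANOVA.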

For the analyses of the N170, epoched ERPs were averaged for the following
five ROIs on the left, following Lavric, Clapp, and Rastle (2007): anterior frontal (aF:
Fp1, AF3, AF7, F1, F3, F5, F7), posterior frontal (pF: FC1, FC3, FC5, FT7, FT9, C1, C3,
C5), temporal (T: T7, CP5, TP7, TP9, P7), parietal (P: CP1, CP3, P1, P3, P5), and
parieto-occipital (PO: PO3, PO7, O1). We also analyzed the corresponding regions on
the right, while midline electrodes were not analyzed (Figure 2B). Following other
studies of the N170/M170 (Morris, Grainger, and Holcomb 2013; Fruchter, Stockall,
and Marantz 2013), the time window for the N170 was restricted to 150-200 ms.
We conducted a two-way rANOVA (Laterality [Left and Right] × ROIs [aF, pF, T,
P, and PO]) for the average amplitude of the N170 time windows to identify the
generation source of the N170. When we found significant main effects or
interactions between ROIs and laterality, we performed a follow-up rANOVA (Prime Type
[Related and Unrelated] × Suffix [-sa and -mi]) to assess the presence of priming
effects in each condition.

Figure 2: Regions of interest (ROIs) for the electroencephalography analyses. (A) ROIs for the N400.
FC, fronto-central; C, central; LF, left frontal; RF, right frontal; LT, left temporal; RT, right temporal;
PO, parieto-occipital. (B) ROIs for the N170. aF, anterior frontal; pF, posterior frontal; T, temporal;
P, parietal; PO, parieto-occipital. We analyzed the corresponding regions on the right, while midline
electrodes were excluded from the analyses.

We conducted mixed-effects model analyses to examine the influence of
individual target variables (i.e., TP, frequency, and length) on the moving average of a
5 ms change in millisecond-level amplitude over the time window of the N400. We
used the lmer function in the lme4 package (Bates et al. 2015) and the
clusterperm.lmer function in the permutes package of R (Voeten 2021).


Amplitude ~ length + TP + (1 | Subject)
Amplitude ~ length + frequency + (1 | Subject)
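
The dependent variable of these models is a smoothed amplitude. One possible reading of the 5 ms moving average is sketched below for data sampled at 1,000 Hz, where five consecutive samples span 5 ms. The function name is hypothetical, and the choice to drop edge points that lack a full window is an assumption.

```python
def moving_average(samples, window=5):
    """Moving average over `window` consecutive samples.

    Output value i averages samples[i : i + window], so it is centered on
    sample i + window // 2; positions without a full window are dropped.
    """
    if window < 1 or window > len(samples):
        raise ValueError("window must be between 1 and len(samples)")
    return [
        sum(samples[i:i + window]) / window
        for i in range(len(samples) - window + 1)
    ]


smoothed = moving_average([0, 1, 2, 3, 4, 5, 6])  # [2.0, 3.0, 4.0]
```

Each smoothed time point would then be regressed on the predictors in the formulas above.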


We used the above maximal feasible models, that is, the maximal model that is still 
capable of converging (Bates et al. 2018; Voeten 2021). Importantly, as TP is always 
less than one, log-transformed TP, which was used in linear mixed-effects mode- 
ling, is always a negative value (see 2.2 Stimuli for the definition of TP). Moreover, 
as the N400 is a negative-going potential, a positive regression coefficient (beta) in 
linear mixed-effects models for the TP indicates a larger N400, while a negative 
beta indicates a smaller N400. 
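
The sign convention can be checked with a small worked example. The sketch below computes the contribution of the TP term, beta × log(TP), to the predicted amplitude. The beta values and the TP of 0.1 are invented, and base 10 is assumed for the logarithm.

```python
import math


def tp_contribution(beta, tp):
    """Contribution of the TP term, beta * log10(TP), to the predicted amplitude."""
    if not 0 < tp < 1:
        raise ValueError("TP is expected to lie strictly between 0 and 1")
    return beta * math.log10(tp)


with_positive_beta = tp_contribution(0.5, 0.1)   # 0.5 * (-1) = -0.5
with_negative_beta = tp_contribution(-0.5, 0.1)  # -0.5 * (-1) = 0.5
```

With a positive beta the term pushes the predicted amplitude in the negative direction, which for a negative-going component corresponds to a larger N400; a negative beta has the opposite effect.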

We conducted two mixed-effects model analyses of ROI amplitude, with length
and TP or frequency of individual targets as fixed factors and subjects as random
factors. We used two formulas for the mixed-effects model analysis because of the
high correlation between TP and frequency: one formula included TP and length as 
fixed effects, and the other formula included frequency and length as fixed effects. 
Considering the numerous regressions computed in the analysis, we performed a 
multiple comparison correction procedure (a permutation test following Maris and 
Oostenveld 2007) on the continuous moving average amplitude of the point-by-point 
regressions. We computed the beta values of fixed effects for 10,000 random permu- 
tations to identify temporal clusters in which the target variables were significantly 
correlated with the amplitude. We conducted this analysis separately for each Prime Type
(i.e., related and unrelated conditions) to probe the effect of priming on the ERPs.


3 Results 
3.1 Behavioral results 


Table 2 summarizes the behavioral data. The mean accuracy of the lexical
decision task was 91.5%, indicating that the participants correctly performed the task.
Regarding the accuracy, a two-way rANOVA (Prime Type × Suffix) showed a
significant main effect of suffix for the -mi type stem [Suffix: F(1, 17) = 29.06, p < .0001,
η²G = .31], suggesting that the participants had more difficulty with -mi nouns in
the lexical decision task (Table 3). Conversely, the main effect of prime type and
the interaction were not significant [Prime Type: F(1, 17) = 2.46, p = .14, η²G = .010;
Interaction: F(1, 17) = 0.069, p = .80, η²G = .0004].


Table 2: Behavioral data for stem types and conditions.

                      -mi type stem          -mi type stem          Non-mi type with       Non-mi type with
                      (-sa nouns)            (-mi nouns)            a simple stem          a complex stem
                                                                    (-sa nouns)            (-sa nouns)
                      Accuracy   RTs         Accuracy   RTs         Accuracy   RTs         Accuracy   RTs
                      (%)        (ms)        (%)        (ms)        (%)        (ms)        (%)        (ms)
Related condition     96.4±0.9   665±25      86.1±2.0   735±32      95.3±1.4   674±25      86.8±2.8   788±32
Unrelated condition   95.1±1.3   757±31      84.2±2.8   793±34      94.0±1.0   748±27      81.4±3.0   846±32
Priming effect        1.3        -92         1.9        -58         1.3        -74         5.4        -58

Data are shown as mean ± standard error of the mean (SEM). The priming effect is the mean accuracy
or reaction time (RT) in the related condition minus that in the unrelated condition.

An additional two-way rANOVA (Prime Type × Stem Type) on the accuracy
showed significant main effects of prime and stem types [Prime Type: F(1, 17) =
11.87, p = .0031, η²G = .027; Stem Type: F(1.21, 20.6) = 17.74, p = .0002, η²G = .31], while
the interaction between prime and stem types was marginally significant
[Interaction: F(1.77, 30.03) = 2.92, p = .076, η²G = .014] (Table 3). Multiple comparisons for
the stem types showed that the non-mi type with a complex stem showed lower
accuracy than other stem types [-mi type vs. non-mi type with a complex stem:
t(17) = 4.26, corrected p = .0009, Cohen’s d = 1.24; Non-mi type with a simple stem
vs. non-mi type with a complex stem: t(17) = 4.53, corrected p = .0009, Cohen’s d =
1.12], whereas the -mi and non-mi types with a simple stem showed no significant
difference in accuracy [t(17) = 1.02, p = .32, Cohen’s d = 0.22]. Post-hoc simple effects
analyses for the significant interaction between the prime and stem types showed
significant priming effects for the non-mi type with a complex stem but not for the
other stem types [-mi type: F(1, 17) = 1.31, p = .27, η²G = .019; Non-mi type with
a simple stem: F(1, 17) = 1.31, p = .27, η²G = .016; Non-mi type with a complex
stem: F(1, 17) = 9.48, p = .0068, η²G = .048]. Multiple comparisons for the stem types
under the related and unrelated conditions showed that the non-mi type with a
complex stem was more challenging than other stem types [Related: -mi type vs.
non-mi type with a complex stem: t(17) = 3.66, corrected p = .0058, Cohen’s d =
1.10; Non-mi type with a simple stem vs. non-mi type with a complex stem:
t(17) = 3.45, corrected p = .0058, Cohen’s d = 0.92; Unrelated: -mi type vs. non-mi
type with a complex stem: t(17) = 4.48, corrected p = .0006, Cohen’s d = 1.39;
Non-mi type with a simple stem vs. non-mi type with a complex stem: t(17) =
4.72, corrected p = .0006, Cohen’s d = 1.33], whereas the -mi and non-mi types with
a simple stem showed no significant difference [Related: -mi type vs. non-mi type
with a simple stem: t(17) = 0.75, p = .46, Cohen’s d = 0.21; Unrelated: -mi type vs.
non-mi type with a simple stem: t(17) = 0.72, p = .48, Cohen’s d = 0.22].


Table 3: Analysis of variance results of accuracy.

Factors                    df            F       p        η²G
Prime Type × Suffix
  Prime Type               1, 17         2.46    .14      .010
  Suffix                   1, 17         29.06   <.0001   .31
  Interaction              1, 17         0.069   .80      .0004
Prime Type × Stem Type
  Prime Type               1, 17         11.87   .0031    .027
  Stem Type                1.21, 20.6    17.74   .0002    .31
  Interaction              1.77, 30.03   2.92    .076     .014

df, degrees of freedom; η²G, generalized eta squared.

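
The fractional degrees of freedom in Table 3 follow from the Greenhouse-Geisser correction named in the figure captions, which multiplies both the numerator and the denominator df by an estimated sphericity factor (epsilon). As a rough consistency check, and assuming three stem types, two prime types, and the 18 participants stated in the chapter, epsilon can be recovered from the reported numerator df and applied to the uncorrected error df. The function and variable names below are illustrative only.

```python
def gg_epsilon(corrected_df, uncorrected_df):
    """Greenhouse-Geisser epsilon implied by a reported (corrected) df."""
    return corrected_df / uncorrected_df


N_PARTICIPANTS = 18   # sample size reported in the chapter
N_STEM_TYPES = 3      # -mi type, non-mi simple stem, non-mi complex stem
N_PRIME_TYPES = 2     # related, unrelated

# Uncorrected df for effects involving the three-level Stem Type factor.
df_effect = N_STEM_TYPES - 1                  # numerator df before correction
df_error = df_effect * (N_PARTICIPANTS - 1)   # error df before correction

# Numerator df values as reported in Table 3.
eps_stem = gg_epsilon(1.21, df_effect)
eps_interaction = gg_epsilon(1.77, (N_PRIME_TYPES - 1) * (N_STEM_TYPES - 1))

# Error df implied by applying the same epsilon to the uncorrected error df.
implied_error_df_stem = eps_stem * df_error
implied_error_df_interaction = eps_interaction * df_error

print(round(eps_stem, 3), round(implied_error_df_stem, 2))
print(round(eps_interaction, 3), round(implied_error_df_interaction, 2))
```

The implied error df agree with the reported 20.6 and 30.03 up to the rounding of the reported numerator df.
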
Regardless of the target conditions, the RTs were significantly facilitated when
the morphologically related prime preceded the target (Table 2). In a two-way
rANOVA (Prime Type × Suffix) for the -mi type stem, we found facilitating priming
effects for the related condition [Prime Type: F(1, 17) = 70.11, p < .0001, η²G = .080]
(Table 4). Additionally, we found a significant main effect of suffix [Suffix:
F(1, 17) = 30.92, p < .0001, η²G = .041], showing that the RTs of -mi nouns were
significantly longer than those of -sa nouns. However, the interaction between prime
type and suffix was not significant [Interaction: F(1, 17) = 2.23, p = .15, η²G = .0042].
These results demonstrate that both types of nouns showed a morphological priming
effect and that the size of the effect did not differ significantly between the two
types of nouns.


Table 4: ANOVA results of reaction times.

Factors                    df            F       p        η²G
Prime Type × Suffix
  Prime Type               1, 17         70.11   <.0001   .080
  Suffix                   1, 17         30.92   <.0001   .041
  Interaction              1, 17         2.23    .15      .0042
Prime Type × Stem Type
  Prime Type               1, 17         160.2   <.0001   .090
  Stem Type                1.87, 31.8    105.0   <.0001   .15
  Interaction              1.88, 32.03   1.91    .17      .0032

df, degrees of freedom; η²G, generalized eta squared.
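
For readers unfamiliar with the effect size reported in the ANOVA tables, generalized eta squared for a design in which every factor is manipulated within participants is commonly written as shown below. This is the standard textbook form and is given here as an assumption about how the values were obtained; the chapter itself does not state the formula.

```latex
\eta^2_G = \frac{SS_{\text{effect}}}
                {SS_{\text{effect}} + SS_{\text{subjects}} + \sum_{k} SS_{\text{error},k}}
```

Here the sum runs over all within-participant error terms. Because the between-participant variance appears in the denominator, the effect size can be small even when F is large, as with Prime Type in Table 4 (F = 160.2, η²G = .090).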


In the two-way rANOVA (Prime Type × Stem Type) on RTs, robust priming effects were
found for the related condition [Prime Type: F(1, 17) = 160.2, p < .0001, η²G = .090]
(Table 4). We also found a significant main effect of stem type [Stem Type:
F(1.87, 31.8) = 105.0, p < .0001, η²G = .15], while the interaction between prime type
and stem type was not significant [Interaction: F(1.88, 32.03) = 1.91, p = .17,
η²G = .0032]. Multiple comparisons among the stem types showed that the non-mi type
with a complex stem yielded longer RTs than the other stem types [-mi type stem vs.
non-mi type with a complex stem: t(17) = 11.91, corrected p < .0001, Cohen's d = 0.80;
non-mi type with a simple stem vs. non-mi type with a complex stem: t(17) = 11.70,
corrected p < .0001, Cohen's d = 0.84], whereas the -mi type stem and the non-mi type
with a simple stem showed no significant difference in RTs [-mi type stem vs. non-mi
type with a simple stem: t(17) = 0.013, corrected p = .99, Cohen's d = 0.0008].

3.2 Event-related potentials results

3.2.1 Significant priming effects on the N400 for -sa and -mi nouns


The ERPs of the -mi type stem revealed attenuation of the N400 for the related
condition relative to the unrelated condition in the C, LT, RT, and PO regions
(Figure 3A). The scalp distribution of the unrelated − related contrast showed that
the unrelated condition was more negative for both types of nouns (Figure 3B). A
three-way rANOVA (Prime Type × Suffix × ROIs) on the mean amplitude of the N400 for
the -mi type stem revealed significant main effects of prime type, suffix, and ROIs
[Prime Type: F(1, 17) = 20.96, p = .0003, η²G = .045; Suffix: F(1, 17) = 6.73, p = .019,
η²G = .011; ROIs: F(1.74, 29.59) = 8.59, p = .0017, η²G = .13] (Table 5). The
interaction between prime type and ROIs was also significant [Prime Type × ROIs:
F(2.48, 42.24) = 28.28, p < .0001, η²G = .025]. However, the other interactions were
not significant [Prime Type × Suffix: F(1, 17) = 0.11, p = .74, η²G = .0002;
Suffix × ROIs: F(1.91, 32.54) = 1.17, p = .32, η²G = .0013; Prime Type × Suffix × ROIs:
F(2.92, 49.68) = 1.91, p = .14, η²G = .0010].
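
The dependent measure in these rANOVAs is the mean amplitude within the N400 time window (300-500 ms; see Figure 3). As a minimal sketch of that measure, the code below averages the samples of a single waveform that fall inside a window. The sampling interval, epoch start, and waveform values are hypothetical and are not taken from the study.

```python
def mean_amplitude(samples, epoch_start_ms, sample_interval_ms, window_ms):
    """Mean of the samples whose time stamps fall inside window_ms (inclusive)."""
    start_ms, end_ms = window_ms
    selected = [
        value
        for index, value in enumerate(samples)
        if start_ms <= epoch_start_ms + index * sample_interval_ms <= end_ms
    ]
    return sum(selected) / len(selected)


# Hypothetical waveform in microvolts, one sample every 100 ms from 0 to 600 ms.
waveform = [0.0, 0.5, 1.0, -2.0, -3.0, -1.0, 0.0]

# Average over the 300-500 ms window used for the N400 in this chapter.
n400 = mean_amplitude(
    waveform, epoch_start_ms=0, sample_interval_ms=100, window_ms=(300, 500)
)
print(n400)
```

In practice the same averaging would be applied per participant, condition, and ROI before the values enter the rANOVA.
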

To determine the conditions for further analyses, simple effects tests for the
significant interaction (i.e., prime type and ROIs) were conducted. The effect of ROIs
was significant regardless of prime type [Related: F(1.87, 31.82) = 13.66, p = .0001,
η²G = .19; Unrelated: F(1.7, 28.88) = 5.06, p = .017, η²G = .096]. The effect of prime
type was significant only in the C, LT, RT, and PO regions [C: F(1, 17) = 22.10,
p = .0002, η²G = .059; LT: F(1, 17) = 24.27, p = .0001, η²G = .097; RT: F(1, 17) = 23.69,
p = .0001, η²G = .066; PO: F(1, 17) = 61.41, p < .0001, η²G = .12]; it was not
significant in the FC, LF, and RF regions [FC: F(1, 17) = 0.73, p = .40, η²G = .0032;
LF: F(1, 17) = 2.85, p = .11, η²G = .010; RF: F(1, 17) = 1.17, p = .29, η²G = .0050].

We conducted follow-up two-way rANOVAs (Prime Type × Suffix) for each of the regions
that showed significant priming effects (C, LT, RT, and PO). Beyond the significant
priming effects, we found a significant main effect of suffix in the C and LT regions
[C: F(1, 17) = 7.30, p = .015, η²G = .015; LT: F(1, 17) = 5.02, p = .039, η²G = .017]
(Figure 3C and Table 5). However, the RT and PO regions did not show a significant
effect of suffix [RT: F(1, 17) = 1.26, p = .28, η²G = .0043; PO: F(1, 17) = 1.81,
p = .20, η²G = .0044]. These results suggest that semantic processing of the
lexicalized meaning of -mi nouns elicited the larger N400 in the C and LT regions.
The interaction was not significant in any region [C: F(1, 17) = 0.0066, p = .94,
η²G < .0001; LT: F(1, 17) = 0.48, p = .50, η²G = .0013; RT: F(1, 17) = 0.20, p = .66,
η²G = .0003; PO: F(1, 17) = 0.52, p = .48, η²G = .0006].

We further examined whether ERPs in the C, LT, RT, and PO regions for the non-mi
types showed similar attenuation of the N400 for the related condition. Using
two-way rANOVAs (Prime Type × ROIs), we found that the related condition showed an
attenuated N400 peak for the non-mi type stems [non-mi type with a simple stem:
F(1, 17) = 21.19, p = .0003, η²G = .10; non-mi type with a complex stem:
F(1, 17) = 21.64, p = .0002, η²G = .14] (Figure 4). Moreover, post-hoc comparisons
between the related and unrelated conditions in each ROI showed significant priming
effects in all four regions [non-mi type with a simple stem: C: t(17) = 3.26,
corrected p = .0046, Cohen's d = 0.77; LT: t(17) = 4.40, corrected p = .0011,
Cohen's d = 1.04; RT: t(17) = 3.52, corrected p = .0052, Cohen's d = 0.83; PO:
t(17) = 5.34, corrected p < .0001, Cohen's d = 1.34; non-mi type with a complex stem:
C: t(17) = 2.38, corrected p = .029, Cohen's d = 0.56; LT: t(17) = 3.56, corrected
p = .0073, Cohen's d = 0.84; RT: t(17) = 3.34, corrected p = .0078, Cohen's d = 0.79;
PO: t(17) = 3.78, corrected p = .0060, Cohen's d = 0.89].

Figure 3: Significant priming effects on the N400. (A) Event-related potentials of the -mi type. Blue
and red lines represent the unrelated and related conditions, respectively. Solid and dashed lines
represent the -sa and -mi nouns, respectively. The rectangle areas (300-500 ms) indicate the time
window of the N400. FC, fronto-central; C, central; LF, left frontal; RF, right frontal; LT, left temporal;
RT, right temporal; PO, parieto-occipital. (B) The scalp distribution of the unrelated − related contrast.
(C) The amplitudes of the N400 (mean ± SEM). Filled and open bars denote the related and unrelated
conditions, respectively. We applied the Greenhouse-Geisser correction for rANOVA and Shaffer's
modified sequentially rejective Bonferroni procedure for the post-hoc tests. *corrected p < .05,
**corrected p < .01, ***corrected p < .001.

Table 5: ANOVA results of the N400.

Factors                         df            F        p        η²G
Prime Type × Suffix × ROIs
  Prime Type                    1, 17         20.96    .0003    .045
  Suffix                        1, 17         6.73     .019     .011
  ROIs                          1.74, 29.59   8.59     .0017    .13
  Prime Type × Suffix           1, 17         0.11     .74      .0002
  Prime Type × ROIs             2.48, 42.24   28.28    <.0001   .025
  Suffix × ROIs                 1.91, 32.54   1.17     .32      .0013
  Prime Type × Suffix × ROIs    2.92, 49.68   1.91     .14      .0010
C: Prime Type × Suffix
  Prime Type                    1, 17         22.10    .0002    .059
  Suffix                        1, 17         7.30     .015     .015
  Prime Type × Suffix           1, 17         0.0066   .94      <.0001
LT: Prime Type × Suffix
  Prime Type                    1, 17         24.27    .0001    .097
  Suffix                        1, 17         5.02     .039     .017
  Prime Type × Suffix           1, 17         0.48     .50      .0013
RT: Prime Type × Suffix
  Prime Type                    1, 17         23.69    .0001    .066
  Suffix                        1, 17         1.26     .28      .0043
  Prime Type × Suffix           1, 17         0.20     .66      .0003
PO: Prime Type × Suffix
  Prime Type                    1, 17         61.41    <.0001   .12
  Suffix                        1, 17         1.81     .20      .0044
  Prime Type × Suffix           1, 17         0.52     .48      .0006

df, degrees of freedom; η²G, generalized eta squared.


Figure 4: Event-related potentials (ERPs) of the non-mi types. (A) ERPs of the non-mi type with a
simple stem and (B) ERPs of the non-mi type with a complex stem. Solid and dashed lines represent
the unrelated and related conditions, respectively. The rectangle areas (300-500 ms) indicate the time
window of the N400. FC, fronto-central; C, central; LF, left frontal; RF, right frontal; LT, left temporal; RT,
right temporal; PO, parieto-occipital. We applied the Greenhouse-Geisser correction for rANOVA and
Shaffer's modified sequentially rejective Bonferroni procedure for the post-hoc tests. *corrected p < .05,
**corrected p < .01, ***corrected p < .001.


3.2.2 Attenuation of the N400 by transition probability


The linear mixed-effects model analyses revealed that the model including the TP
between the stem and suffix as a fixed effect showed significant negative
correlations for the unrelated condition in two different time windows (Figure 5A).
The correlation between the N400 amplitude and TP was significant in the C, LT, and
PO regions at around 330-370 ms, while at approximately 440-460 ms, the correlation
was significant in the C, LT, RT, and PO regions (corrected p < .05). Given the
negative regression coefficient (beta) in these regions, the results show that the TP
between morphemes attenuated the amplitude of the N400, which is a crucial ERP
component for lexical access to morphemes and recombination of morphemes (see
Section 2.7, Event-related potentials data processing, for the relationship between
the N400 and the regression coefficient of TP). However, no significant correlations
were found for the related condition (Figure 5B). We further tested another model
including frequency as a fixed effect, which did not show a significant correlation
in any region or condition throughout the N400 time window (300-500 ms).
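
The logic of a point-by-point regression of amplitude on TP can be sketched as follows. This is a deliberately simplified stand-in: it fits an ordinary least-squares slope at each time point rather than the linear mixed-effects model used in the chapter, it omits the permutation-based correction, and all TP and amplitude values are invented for illustration.

```python
def ols_slope(x, y):
    """Least-squares slope of y regressed on x (model with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical transition probabilities for four items (not the study's values).
tp = [0.1, 0.2, 0.3, 0.4]

# Hypothetical amplitudes: one row per time point, one value per item.
amplitude_by_timepoint = [
    [1.0, 1.0, 1.0, 1.0],   # no relation to TP
    [2.0, 1.5, 1.0, 0.5],   # amplitude decreases as TP increases
    [0.5, 1.0, 1.5, 2.0],   # amplitude increases as TP increases
]

# One regression coefficient (beta) per time point.
betas = [ols_slope(tp, amps) for amps in amplitude_by_timepoint]

# Time points with a negative beta; runs of such points would then be tested as clusters.
negative_points = [i for i, beta in enumerate(betas) if beta < 0]
print(betas, negative_points)
```

A full analysis would add random effects for participants and items and would assess contiguous runs of significant time points against a permutation distribution, as described in the Figure 5 caption.
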


3.2.3 Significant laterality effects on the N170 


Across all four conditions, the scalp distribution of the N170 showed a lower
amplitude in the left hemisphere than in the right hemisphere, indicating that the
N170 component was lateralized to the left hemisphere (Figure 6A). A two-way rANOVA
(Laterality × ROIs) revealed significant main effects of laterality and ROIs
[Laterality: F(1, 71) = 7.37, p = .0083, η²G = .0081; ROIs: F(1.76, 124.97) = 24.77,
p < .0001, η²G = .084] (Table 6). The interaction between laterality and ROIs was
also significant [Interaction: F(2.42, 172.15) = 6.15, p = .0013, η²G = .0025].
Multiple comparisons among the ROIs showed significantly lower amplitudes for the T
and PO regions than for the other regions [T vs. aF: t(71) = 7.16, corrected
p < .0001, Cohen's d = 0.81; T vs. pF: t(71) = 8.48, corrected p < .0001, Cohen's
d = 0.70; T vs. P: t(71) = 4.09, corrected p = .0005, Cohen's d = 0.34; PO vs. aF:
t(71) = 5.06, corrected p < .0001, Cohen's d = 0.72; PO vs. pF: t(71) = 5.94,
corrected p < .0001, Cohen's d = 0.65; PO vs. P: t(71) = 5.99, corrected p < .0001,
Cohen's d = 0.37]. However, post-hoc analysis revealed no significant difference
between the mean amplitudes of T and PO [t(71) = 1.19, corrected p = .48, Cohen's
d = 0.10]. A comparison of the ERPs in these regions revealed a clear, sharp peak in
the PO region of the left hemisphere (Figure 6B). We found significant laterality
effects on the N170 in the T and PO regions [T: t(71) = 2.56, corrected p = .013,
Cohen's d = 0.30; PO: t(71) = 3.32, corrected p = .0028, Cohen's d = 0.39].

A follow-up rANOVA (Prime Type × Suffix) was performed for the PO region in the left
hemisphere. Neither the main effects of prime type and suffix nor their interaction
was significant [Prime Type: F(1, 17) = 0.038, p = .85, η²G = .0006; Suffix:
F(1, 17) = 0.011, p = .92, η²G = .0002; Interaction: F(1, 17) = 0.079, p = .78,
η²G = .0011] (Table 6). These results indicate that the target -sa and -mi nouns were
processed similarly in this time window, even when the nouns were preceded by
morphologically related primes. Further, the linear mixed-effects model analyses of
the left PO amplitude found no significant temporal cluster for either TP or
frequency (corrected p > .05), indicating that TP and frequency did not modulate the
amplitude of the N170.

Figure 5: Effect of the transition probability between stem and suffix in the N400 region. Related
condition (A) and unrelated condition (B). The rectangle areas indicate significant temporal
clusters (corrected p < .05). C, central; LT, left temporal; RT, right temporal; PO, parieto-occipital.
We performed a multiple comparison correction procedure on the continuous moving average
amplitude of the point-by-point regressions using a permutation test. *corrected p < .05.

Figure 6: Significant laterality effects on the N170 in the temporal (T) and parieto-occipital (PO) regions.
(A) Averaged N170 (150-200 ms) scalp distributions of the four conditions (Prime Type × Suffix).
The N170 was more negative-going in the left T and PO regions. (B) Event-related potentials (ERPs)
of the T and PO regions. Solid and dashed lines represent ERPs of the left and right hemispheres,
respectively. The rectangle areas indicate the time window of the N170 (150-200 ms). We applied the
Greenhouse-Geisser correction for rANOVA and Shaffer's modified sequentially rejective Bonferroni
procedure for the post-hoc tests. *corrected p < .05, **corrected p < .01.


Table 6: ANOVA results of the N170.

Factors                  df             F       p        η²G
Laterality × ROIs
  Laterality             1, 71          7.37    .0083    .0081
  ROIs                   1.76, 124.97   24.77   <.0001   .084
  Interaction            2.42, 172.15   6.15    .0013    .0025
PO: Prime Type × Suffix
  Prime Type             1, 17          0.038   .85      .0006
  Suffix                 1, 17          0.011   .92      .0002
  Interaction            1, 17          0.079   .78      .0011

df, degrees of freedom; η²G, generalized eta squared.


4 Discussion 


In the present ERP study using the masked priming paradigm (Figure 1), we obtained
three striking results by examining the N400 and N170 components for two types of
Japanese derived nouns: semantically transparent -sa nouns and lexicalized -mi nouns
(Table 7). First, we demonstrated significant priming effects on the N400, as well as
significant laterality effects on the N170, for -sa and -mi nouns (Figures 3, 4, and
6; Tables 5 and 6). These results suggest that morphologically complex words were
processed by a common neural mechanism (i.e., similarity). Moreover, we found a
larger N400 and lower behavioral performance (lower accuracy and longer RTs) for -mi
nouns, given their lexicalized meaning (i.e., dissimilarity). Second, using linear
mixed-effects models, we showed that the TP from stem to suffix significantly and
negatively modulated the amplitude of the N400 in the LT, C, RT, and PO regions
(corrected p < .05) (Figure 5). These results indicate that the TP attenuates the
amplitude of the N400. Third, both types of derived nouns showed significant
morphological priming effects in the behavioral data, that is, shorter RTs and lower
error rates under the related condition (Tables 2, 3, and 4). These results support
similar neural mechanisms for processing the two derived noun types. However, the
behavioral data also indicate that -mi nouns were more demanding than -sa nouns,
further suggesting a dissimilarity in the processing mechanisms for these
de-adjectival nouns.

In the rANOVAs of the N400 for the factor of prime type, we found a significantly
lower mean amplitude for the unrelated condition in the C, LT, RT, and PO ROIs
(Figure 3A and 3C, Table 5). These results show that the N400 was attenuated under
the related condition, consistent with a previous masked priming ERP study (Lavric,
Clapp, and Rastle 2007). As the masked prime activates the lexical entry of
morphemes, these results also suggest a morphological relation between a prime
adjective and a target -sa/-mi noun (i.e., both target -sa and -mi nouns share a
common adjectival stem with the prime adjective). As reported in prior ERP studies
(Lavric, Clapp, and Rastle 2007; Morris and Stockall 2012), the results suggest that
the common morpheme (i.e., stem) in the prime adjective and the target de-adjectival
noun was activated in the parieto-occipital area. Notably, the non-mi type stems
showed similar attenuation of the N400 for the related condition (Figure 4).
Intriguingly, we found a larger N400 for -mi nouns in the C and LT regions (Figure
3C), indicating that the lexicalized meaning of -mi nouns imposed higher loads on
semantic processing, consistent with previous studies (Hagiwara et al. 1999; Clahsen
and Ikemoto 2012). These results demonstrate the similarities and dissimilarities
between -sa and -mi noun processing mechanisms.


Table 7: Summary of similarities and dissimilarities between -sa
and -mi nouns.

            Similarities                             Dissimilarities
Accuracy    –                                        -sa > -mi
RTs         Priming effects: Related < Unrelated     -sa < -mi
N400        Priming effects: Related < Unrelated     -sa < -mi
N170        Laterality effects: Left > Right         –

Behavioral data showed lower behavioral performance for -mi nouns
than -sa nouns. We found significant priming effects (attenuation of
the N400) for both de-adjectival nouns, while the N400 was larger
for -mi nouns. The N170 showed significant laterality effects for -sa
and -mi nouns.

In the mixed-effects model analyses of the N400, we found two significant temporal
clusters in which the TP negatively modulated the N400 under the unrelated condition
(Figure 5). However, the surface frequency did not have such a modulatory effect.
This result implies that both types of derived nouns are processed as morphemes
(stems and suffixes) rather than as whole-word forms. Further, the lack of temporal
clusters in the related condition implies that the attenuated peak of the N400 in
Figure 3A reflected the lower processing loads of morphological decomposition of the
de-adjectival nouns. A previous study of the N400 component evoked by word
processing suggests that there are some subcomponents in the N400 time window
(Pylkkänen and Marantz 2003). Moreover, MEG studies that address the M350
demonstrate that the M350 reflects two stages of the lexical process: lexical access
and recombination of morphemes (Fruchter and Marantz 2015; Neophytou et al. 2018;
Stockall et al. 2019). Therefore, the two temporal clusters in this study may
correspond to two different stages of the lexical process. Importantly, linear
mixed-effects modeling showed a negative beta value for the TP in each temporal
cluster. That is, as a larger N400 peak indicates a higher processing load for the
target, the negative beta demonstrates that when the target TP is smaller, the
following suffix becomes more challenging for readers to predict. The results show
that derived nouns are represented as stems and suffixes, even in the case of -mi
nouns, which contradicts Hagiwara et al. (1999), who postulated no decomposition.

In the behavioral data, we found robust priming effects in both types of nouns
(Tables 2, 3, and 4), which replicates the findings of Clahsen and Ikemoto (2012).
As semantic priming does not occur in the masked priming paradigm, the priming
effect can reflect the morpho-orthographical relation between prime and target
(Frost, Forster, and Deutsch 1997; Rastle et al. 2000). Intriguingly, relative to the
-sa nouns, the -mi nouns showed longer RTs and a larger N400 (Figure 3C), a component
that reflects lexical access and morphological recombination. The results demonstrate
that a difference in semantic transparency can be observed at the behavioral and
later neural level (i.e., >300 ms: the N400 time window in our study), not at the
earlier neural level (i.e., <200 ms: the N170 time window in our study). This
interpretation can also explain the behavioral dissociation of -sa and -mi nouns in
agrammatic aphasic patients (Hagiwara et al. 1999).

For the N170 component, which reflects morpho-orthographical decomposi- 
tion, the rANOVA of ERPs shows a sharp and significantly lower peak in the left PO 
region (Figure 6B). This result suggests that both noun types were decomposed into 
stems and suffixes. Further, there is no significant difference between -sa and -mi 
nouns or between related and unrelated conditions (Figure 6A and Table 6). The 
results further suggest that the target nouns are processed in the same manner, 
regardless of the noun or prime type. 

The present results run contrary to prior cross-modal priming studies in English
and to aphasia studies, which support the dual mechanism theory (Marslen-Wilson
1993; Hagiwara et al. 1999). The inconsistent results may stem from differences in
the experimental paradigm. Prior studies employed auditory and visual stimuli or
required production and comprehension, while this study employed visual stimuli and
required comprehension alone (Figure 1). Moreover, we recruited a relatively small
number of participants (n = 18), which could explain the lack of a priming effect
for the N170 (Figure 6A and Table 6) such as that reported in a previous MEG study
(Fruchter, Stockall, and Marantz 2013). Future studies can consider recruiting more
participants.

An essential topic in psycho-neurolinguistics is whether morphological processing
(e.g., morphological decomposition, lexical access, and recombination) is universal
among languages. As most prior studies targeted English, it is important to probe
morphological processing in typologically different languages, including Japanese.
For example, a previous MEG study examined morphological processing in Japanese
verbs and found that morphologically complex causative verbs elicited the N170,
suggesting morphological decomposition (Ohta, Oseki, and Marantz 2019). Recent MEG
studies also reported similar results in different languages (Finnish: Hakala et al.
2018; Greek: Neophytou et al. 2018; Hebrew: Kastner, Pylkkänen, and Marantz 2018;
and Tagalog: Wray, Stockall, and Marantz 2022). As with such studies, the present
ERP study helps elucidate the neural basis of morphological processing (Table 7). As
noted by Embick, Creemers, and Goodwin Davies (2021: 95), the question of whether
words are decomposed is too coarse-grained to guide the development of competing
morphological theories. Future studies must probe more fine-grained research
questions to contribute insight into the neural basis of morphological processing
(Embick and Poeppel 2015).

Whether morphology differs from syntax is another critical issue in theoretical and
experimental linguistics. A recent MEG study reported a common neural activation for
building compounds and phrases in the left temporal lobe (Flick et al. 2018; see
also Gwilliams 2020), although whether the left frontal language area is activated
during morphological processing remains unclear. Prior fMRI studies show that the
left inferior frontal gyrus is a core region for syntactic computation (Ohta, Fukui,
and Sakai 2013a, 2013b; Ohta, Koizumi, and Sakai 2017; Tanaka et al. 2017; Tanaka et
al. 2019), suggesting that this region may also contribute to morphological
processing. Further research must examine the function of the left inferior frontal
gyrus in morphological processing.

This study aimed to bridge the gap in the neurophysiological evidence for the
processing of morphologically complex words. We hypothesized a common mechanism for
two types of Japanese de-adjectival nouns and demonstrated it in an ERP study
employing a masked priming paradigm. In the ERP analyses, the two ERP components
(N170 and N400) show that the neural processes for the two types of nouns are driven
by a common mechanism (Figures 3, 5, and 6; Tables 5 and 6). This finding accords
with studies of complex word processing in English. Moreover, we found a larger N400
for -mi nouns, given their lexicalized meaning (Figure 3C), suggesting a
dissimilarity in the neural mechanisms for processing the two types of de-adjectival
nouns. In particular, we elucidated where and when the variables of nouns modulate
each ERP component by applying mixed-effects models (Figure 5). The modeling shows
two separate stages of the lexical process, which can be explained by the TP between
morphemes. In the behavioral analyses, we found a robust priming effect for the two
types of nouns (Tables 2, 3, and 4), which also supports our hypothesis. Although
further evidence is required to establish a more detailed mechanism for word
recognition, the neural and behavioral results clearly show a common processing
mechanism for the two types of derived nouns in the earlier stage (i.e.,
morphological decomposition) (Table 7). In the later stage, which reflects lexical
access and recombination of morphemes, different neural mechanisms processed the two
types of de-adjectival nouns (Table 7). The results show the similarities and
dissimilarities of the neural mechanisms for processing two types of de-adjectival
nouns in Japanese, which were also proposed in the past-tense debate in English. By
demonstrating that the Japanese word-processing mechanism accords with that of
typologically different languages, this study contributes to elucidating the
cross-linguistic universality of the neural basis of morphological processing.

Appendix. The list of the adjectives used in this study

-mi type stem               Non-mi type stem             Non-mi type stem
                            (simple stem)                (complex stem)
aka-i (red)                 atarasi-i (new)              aoziro-i (pale)
akaru-i (bright)            haya-i (early)               araarasi-i (violent)
ama-i (sweet)               hido-i (horrible)            bakabakasi-i (ridiculous)
ao-i (blue)                 hiku-i (low)                 buatu-i (thick)
atataka-i (warm)            hiro-i (spacious)            habahiro-i (broad)
atu-i (thick)               huru-i (old)                 hadazamu-i (chilly)
huka-i (deep)               kata-i (hard)                hodotoo-i (far away)
ita-i (painful)             kawai-i (cute)               hosonaga-i (long and thin)
kanasi-i (sad)              kowa-i (scary)               kandaka-i (high-pitched)
kara-i (spicy)              kura-i (dark)                kiiro-i (yellow)
kayu-i (itchy)              mizika-i (short)             kimazu-i (unpleasant)
kurusi-i (distressed)       muzukasi-i (difficult)       kodaka-i (slightly elevated)
kusa-i (smelly)             naga-i (long)                kokoroboso-i (lonely)
maru-i (round)              oisi-i (tasty)               monosugo-i (tremendous)
niga-i (bitter)             oo-i (many/much)             musiatu-i (hot and humid)
okasi-i (funny)             ooki-i (large)               nadaka-i (renowned)
omo-i (heavy)               oso-i (late/slow)            namanamasi-i (graphic/vivid)
omosiro-i (interesting)     sukuna-i (few/little)        nezuyo-i (deep-rooted)
sibu-i (astringent)         tiisa-i (small)              okubuka-i (profound)
sitasi-i (friendly)         too-i (distant)              sikaku-i (square)
sugo-i (fantastic)          tura-i (stressful)           subaya-i (quick)
taka-i (high)               uresi-i (happy)              teatu-i (hospitable)
toro-i (stupid/slow)        usu-i (thin)                 tebaya-i (speedy)
tuyo-i (strong)             waka-i (young)               tyairo-i (brown)
uma-i (tasty)               waru-i (bad)                 usugura-i (dim)
yowa-i (weak)               yasu-i (cheap)               yowayowasi-i (faint)

Chapter 9

Brain mechanisms for the processing 
of Japanese subject-marking particles 
wa, ga, and no 


1 Introduction 


Advances in neuroimaging techniques, such as functional magnetic resonance
imaging (fMRI), have prompted neurolinguists to identify the brain regions
responsible for syntactic structure-building during online sentence processing. The
literature has shown that the left inferior frontal gyrus (IFG) and the posterior 
part of the left temporal cortex, traditionally called Broca's and Wernicke's areas, 
are responsible for several aspects of language processing (Price 2012). Recent 
meta-analyses of fMRI studies suggest that the IFG and the posterior temporal 
cortex in the left hemisphere are crucial for syntactic processing (Heard and Lee 
2020; Rodd et al. 2015; Zaccarella, Schell, and Friederici 2017). Moreover, some 
studies have highlighted functional subregions involved in different aspects of 
syntactic processing such as syntactic reanalysis (Hirotani et al. 2011) and online 
structure-building (Zaccarella, Schell, and Friederici 2017). However, current 
fMRI evidence may be biased toward well-studied Western languages such as 
English or German (see the cited meta-analysis studies above). Though Japanese 
is a relatively well-studied Asian language, many aspects of its unique syntactic 
features remain unexplored. 

Consider, for example, how a sentence is syntactically parsed in real-time. One 
of the most fundamental aspects of structure-building is to establish the relations 
between arguments and predicates. Regarding online incremental processing, the 
verb plays a significant role in a Subject-Verb-Object language, such as English, 
because it is available at an early stage of incremental structure-building and 
constrains what can follow in the rest of the sentence. Thus, structure-building is 
verb-driven: the verb, along with its subject, plays a significant role in predicting 
incoming arguments and adjuncts (Gorrell 1995; Kamide, Altmann, and Haywood 
2003; Pritchett 1992). In contrast, in a Subject-Object-Verb (SOV) language, such as 
Japanese, all the arguments and adjuncts precede the verb, so incremental 
structure-building is largely completed before the verb is encountered; thus, 
structure-building is considered argument-driven (Kamide and Mitchell 1999; Kamide, 
Altmann, and Haywood 2003). Hence, particles attached to noun phrases (NPs) play 
significant roles in structure-building in that they can incrementally assign a proper 
syntactic structure to the inputs. However, only a few fMRI studies have probed the 
neural bases of the processing of Japanese particles (Hashimoto, Yokoyama, and 
Kawashima 2014; Inui, Ogawa, and Ohba 2007), and how different particles recruit 
brain regions related to syntactic processing has rarely been explored. 

Accordingly, this study focused on the processing of three variants of sub- 
ject-marking particles in Japanese, as they may drive distinct structure-building or 
syntactic reanalysis with a minimum impact on the semantic content of a sentence. 
As shown in (1), Japanese subjects can be marked by particles such as the topic 
particle wa, the nominative particle ga, and the genitive particle no.¹ Regarding 
incremental structure-building, we assumed that the different subject-marking 
particles drive syntax-related activation differently. We conducted two fMRI 
experiments, one with pairs of wa- and ga-marked subjects and one with pairs of 
ga- and no-marked subjects, and in each we compared the activations elicited by 
the different subject-marking particles. The next section briefly introduces some 
theoretical assumptions regarding the three subject-marking particles. 


(1) a. Subject with wa 
       Taro-wa   Mari-o    tataita. 
       Taro-TOP  Mari-ACC  hit 
       “Taro hit Mari.” 

    b. Subject with ga 
       Taro-ga   Mari-o    tataita. 
       Taro-NOM  Mari-ACC  hit 
       “Taro hit Mari.” 

    c. Subject with no 
       Taro-no   kaita  hon 
       Taro-GEN  wrote  book 
       “The book that Taro wrote” 


1 These particles are not exclusively used as subject markers and can mark other elements in 
a sentence, such as an object. In addition, Japanese subjects can be marked with other particles, 
such as the dative particle ni: 

       Taro-ni   eigo-ga      dekiru. 
       Taro-DAT  English-NOM  can.do 
       “Taro can speak English.” 


1.1 Subjects with wa and ga 


Notably, wa can be used as a thematic or contrastive topic marker (Kuno 1973). In 
this chapter, we focused on the non-contrastive use of wa, as in (1a), widely recog- 
nized as a topic marker. In the literature on generative syntax, there is a consensus 
that the syntactic position of the wa-marked subject as a thematic topic is higher 
than that of the ga-marked subject, adding a structural layer. The clausal structure 
shown in (2) has been proposed from the perspective of the cartographic approach 
to the left-periphery (Rizzi 1997), and the wa-marked subject is often assumed to 
be in the Spec of TopP (Endo 2007; Miyagawa 2017; Nakamura 2020), while the 
ga-marked subject is in the Spec of TP: 


(2) [TopP [TP [vP [VP]]]] 
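
As a rough illustration of what "adding a structural layer" amounts to, the following Python sketch counts the nesting depth of the labeled bracketing in (2) and of the same clause without the TopP layer. The bracketing used for the ga-marked clause and the function name max_depth are our own simplifications for illustration; they are not taken from the cited analyses.

```python
def max_depth(bracketing: str) -> int:
    """Return the deepest level of square-bracket nesting in a labeled bracketing."""
    depth = 0
    deepest = 0
    for ch in bracketing:
        if ch == "[":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == "]":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced bracketing")
    if depth != 0:
        raise ValueError("unbalanced bracketing")
    return deepest


# Structure (2): a topic layer (TopP) above TP, as assumed for wa-marked subjects.
WA_CLAUSE = "[TopP [TP [vP [VP]]]]"
# Assumed simplification for a ga-marked subject: the same clause without TopP.
GA_CLAUSE = "[TP [vP [VP]]]"

extra_layers = max_depth(WA_CLAUSE) - max_depth(GA_CLAUSE)
print(extra_layers)
```

Under these assumptions the wa-marked clause has exactly one more projection than the ga-marked clause, which is the structural difference that Experiment 1 targets.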


To our knowledge, only one psycholinguistic study has examined the difference in 
processing costs between topicalized sentences (i.e., wa-marked SOV; S_{TOP}OV) and 
their non-topicalized counterparts (i.e., ga-marked SOV; S_{NOM}OV). Imamura, Sato, 
and Koizumi (2016) examined whether sentences with a topicalized wa-marked subject 
were more cognitively demanding than those with a non-topicalized ga-marked 
subject. They used a semantic correctness decision task in which participants 
were required to judge whether a presented sentence was semantically 
plausible; they found no significant differences in error rates or reaction times 
between the two sentence types. They interpreted the results as reflecting a 
trade-off between the lower frequency of S_{NOM}OV relative to S_{TOP}OV and the more 
complex syntactic hierarchy of S_{TOP}OV relative to S_{NOM}OV. To date, no study has 
provided direct evidence of the increased syntactic cost of S_{TOP}OV relative to that 
of S_{NOM}OV. This study examined whether wa-marked subjects, relative to ga-marked 
ones, induced increased activation in brain regions related to syntactic 
structure-building. 

1.2 Subjects with ga and no 


Syntactic properties of subjects with ga and no have been extensively studied in the 
literature on theoretical syntax, wherein various syntactic conditions have been 
identified for no-marked subjects but not for ga-marked subjects (e.g., Harada 1971; 
Hiraiwa 2005; Miyagawa 2011; Watanabe 1996; see also Maki and Uchibori 2008 
and Ochi 2017 for an overview).² For example, while no-marked subjects typically 
appear in relative clauses, as in (1c), they cannot appear in matrix clauses or in 
some subordinate clauses, as in (3). 


(3) a. Subject in a main clause 
       Taro-ga/*no   Mari-o    tataita. 
       Taro-NOM/GEN  Mari-ACC  hit 
       “Taro hit Mari.” 

    b. Subject in a subordinate clause with complementizer to 
       Taro-ga/*no   Mari-o    tataita-to  kiita. 
       Taro-NOM/GEN  Mari-ACC  hit-COMP    heard 
       “(I) heard that Taro hit Mari.” 


Among various syntactic restrictions of no-marked subjects discussed in the litera- 
ture, the present chapter focused on the adjacency constraint, whereby the accepta- 
bility of no-marked subjects is said to be degraded by the presence of intervening 
elements between the subject and the verb (Harada 1971), as shown in (4). 


(4) a. Subject in the adjacent condition 
       Kyoo   juku-de         kodomotachi-ga/no  naratta  rekishi-wa 
       today  cram.school-at  children-NOM/GEN   studied  history-TOP 
       “The history that the children studied at a cram school today . . .” 

    b. Subject in the non-adjacent condition 
       Kodomotachi-ga/no  kyoo   juku-de         naratta  rekishi-wa 
       children-NOM/GEN   today  cram.school-at  studied  history-TOP 
       “The history that the children studied at a cram school today . . .” 


2 In generative syntax studies, there is a dispute as to whether ga- and no-marked subjects share 
the same syntactic structure; this chapter probed the difference in frequency and the processing 
cost of reanalysis on the no-marked NP sentence, leaving it to future research to resolve the theo- 
retical dispute. 

From an empirical perspective, Nambu and Nakatani (2014) conducted an 
acceptability judgment experiment and confirmed that the acceptability of no-marked 
subjects was degraded in the non-adjacent condition, as in (4b), while ga-marked 
subjects were not sensitive to the adjacency factor. Independently of this syntactic 
constraint, they found a main effect of case marking: the acceptability of 
no-marked subjects was lower than that of ga-marked subjects. 
Moreover, Nambu (2013a) analyzed the frequencies of ga- and no-marked 
subjects in the relevant linguistic environments in corpus data and identified both 
the scarcity of no-marked subjects and the effect of the adjacency constraint 
(adjacent ga, 88.9%; non-adjacent ga, 99.8%; adjacent no, 11.1%; non-adjacent no, 
0.2% in the Corpus of Spontaneous Japanese Speech). 

The predominant use of ga as a subject marker in the relevant environments 
has stemmed from language change (Frellesvig 2010; Nambu 2019); further, the 
primary function of no is adnominal, typically possessive. From the perspective 
of comprehension, Nambu (2013b) conducted a sentence completion experiment, 
where participants were provided a no-marked NP and asked to complete the sen- 
tence. They found that the adnominal interpretation was dominant over the subject 
interpretation (adnominal interpretation, 83.2%; subject interpretation, 16.8%). 

These empirical findings suggest that no-marked subjects, which are far less 
frequent than ga-marked ones, likely induce the adnominal interpretation and, 
thus, require a syntactic reanalysis during online comprehension. However, to the 
best of our knowledge, no previous studies have investigated whether brain regions 
associated with syntactic reanalysis were activated when reading sentences with 
no-marked subjects. 


1.3 Aims of the study 


We conducted two fMRI experiments to test two hypotheses separately. Experiment 
1 tested the hypothesis that a wa-marked subject is located at a higher syntactic 
position than a ga-marked subject, thus increasing the neural costs for 
structure-building. Experiment 2 tested the hypothesis that a no-marked subject activates brain 
regions associated with syntactic reanalysis because a genitive subject is far less 
frequent than a nominative subject. Moreover, Experiment 2 examined whether 
reanalysis-related brain activation further increased when a genitive subject was 
non-adjacent to the verb. 

2 Experiment 1: wa- vs. ga-marked subjects 


The first experiment aimed to test the hypothesis that wa constructions introduce 
an additional syntactic hierarchy, increasing the neural cost relative to ga con- 
structions. Previously, we demonstrated that the left pars opercularis (PO) in the 
IFG and the left posterior middle temporal gyrus (pMTG) were responsive to the 
level of syntactic hierarchy but not to the processing cost associated with the linear 
distance between the filler and the gap, comparing OSV- and SOV-order sentences 
(Iwabuchi, Nakajima, and Makuuchi 2019). Accordingly, we hypothesized that 
topicalized sentences induce higher activity in these regions than non-topicalized 
sentences. We performed hypothesis-driven volume of interest (VOI) analyses to 
test this hypothesis. Additionally, we conducted a whole-brain analysis to confirm 
that VOIs did not overlap with a larger activated cluster mainly covering other 
functional areas. 


2.1 Methods 


This chapter employed some of the data from our previous study; the details of 
the participants, the experimental stimuli, and the procedure have been reported 
(Iwabuchi, Nakajima, and Makuuchi 2019). 


2.1.1 Participants 


Twenty-two participants (mean 24.7 years old, 19-35 years old; nine male and 13 
female) were recruited; three were excluded from the following analyses given the 
low probe-matching task accuracy (see Iwabuchi, Nakajima, and Makuuchi 2019 for 
details). All participants were right-handed (Flanders handedness questionnaire, 
score range 90-100; Nicholls et al. 2013) native Japanese speakers, had normal or 
corrected-to-normal vision, and had no history of neuropsychiatric disorders. 


2.1.2 Stimuli 


In our previous study, we initially created 30 SOV sentences with a heavy subject 
(hS; e.g., Takabisya-na seikaku-no sakka-ga gaka-o nagutta. [“The writer with an 
aloof character punched the painter.”]) and 30 SOV sentences with a heavy object 
(hO; e.g., Sakka-ga takabisya-na seikaku-no gaka-o nagutta. [“The writer punched 
the painter with an aloof character.”]) (see Iwabuchi, Nakajima, and Makuuchi 2019 
for the complete list of the sentences). We then created sentences with a topicalized 
subject by replacing the nominative marker ga with the topic marker wa in 
the SOV sentences. This study used these topicalized sentences as target stimuli to 
contrast wa-marked constructions with ga-marked ones. A 2 x 2 factorial design 
was employed, with the factors TOPICALIZATION (non-topicalized SOV, topicalized 
SOV [S_{TOP}OV]) and HEAVINESS (hO, hS). Sentences (5a-d) are examples 
of hO-SOV, hS-SOV, hO-S_{TOP}OV, and hS-S_{TOP}OV, respectively. We included the factor 
HEAVINESS to confirm that the effect of TOPICALIZATION was not affected by 
the linear distance between nominal elements in a sentence. We also examined 
the effect of HEAVINESS on behavior and brain activity, although this was not our 
primary interest. 


(5) a. Sakka-ga takabisya-na seikaku-no gaka-o nagutta 
       [writer-NOM] [an aloof character-with painter-ACC] punched 
       “The writer punched the painter with an aloof character.” 

    b. Takabisya-na seikaku-no sakka-ga gaka-o nagutta 
       [an aloof character-with writer-NOM] [painter-ACC] punched 
       “The writer with an aloof character punched the painter.” 

    c. Sakka-wa takabisya-na seikaku-no gaka-o nagutta 
       [writer-TOP] [an aloof character-with painter-ACC] punched 
       “The writer punched the painter with an aloof character.” 

    d. Takabisya-na seikaku-no sakka-wa gaka-o nagutta 
       [an aloof character-with writer-TOP] [painter-ACC] punched 
       “The writer with an aloof character punched the painter.” 


2.1.3 Procedure 


The sentence stimuli were presented with the rapid serial visual presentation 
paradigm; the duration of each phrase was set at 700 ms. The Presentation software 
(Neurobehavioral Systems, Inc., Albany, CA, USA) was used to control the stimulus 
presentation. For each participant, the stimulus presentation order was 
pseudorandomized. After completing practice sessions (see Iwabuchi, Nakajima, and 
Makuuchi 2019 for details), participants underwent three fMRI runs. In each run, 10 
sentence stimuli were presented for each condition (i.e., hO-SOV, hS-SOV, hO-S_{TOP}OV, 
hS-S_{TOP}OV). In 40% of the trials, participants performed a probe-matching task 
in which they were required to judge the semantic match between a target 
sentence (e.g., Takabisya-na seikaku-no sakka-ga gaka-o nagutta. [“The writer with an 
aloof character punched the painter.”]) and a probe sentence (e.g., Sakka-ga gaka-o 
nagutta. [“The writer punched the painter.”]). In the probe-matching task, 
semantically correct probes were presented on half of the trials and incorrect probes 
on the other half. We have described elsewhere how these probes 
were created (Iwabuchi, Nakajima, and Makuuchi 2019). 
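
The trial counts implied by this procedure can be tallied with a short Python sketch. The variable names are illustrative, and the sketch assumes that exactly 40% of all trials carried a probe; how the probes were distributed over runs and conditions is not stated here and is not modeled.

```python
# Design parameters as described in the procedure.
RUNS = 3
CONDITIONS = ["hO-SOV", "hS-SOV", "hO-StopOV", "hS-StopOV"]
SENTENCES_PER_CONDITION_PER_RUN = 10
PROBE_PERCENT = 40  # assumed to hold exactly over the whole experiment

trials_per_run = len(CONDITIONS) * SENTENCES_PER_CONDITION_PER_RUN
total_trials = RUNS * trials_per_run
trials_per_condition = RUNS * SENTENCES_PER_CONDITION_PER_RUN

# Integer arithmetic avoids floating-point rounding in the percentage.
probe_trials = total_trials * PROBE_PERCENT // 100
correct_probe_trials = probe_trials // 2  # half of the probes were correct

print(trials_per_run, total_trials, trials_per_condition)
print(probe_trials, correct_probe_trials)
```

Each condition therefore contributes 30 trials in total, which matches the 30 sentences created per sentence type.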


2.1.4 Functional magnetic resonance imaging data acquisition 


A 3-T MRI scanner (MAGNETOM Skyra; Siemens, Erlangen, Germany) was used 
for MRI data collection. We acquired functional scans using the following 
parameters: repetition time (TR) = 2,000 ms, echo time (TE) = 30 ms, matrix 64 x 64, flip 
angle = 90 degrees, field of view = 192 x 192 mm, 35 axial slices, and slice 
thickness = 3 mm with a 1 mm gap. We acquired 485 volumes during each fMRI run. For 
anatomical reference, a T1-weighted image was obtained from each participant: 
MPRAGE sequence, TR = 2,300 ms, TE = 2.98 ms, inversion time = 900 ms, field of 
view = 256 x 256 mm, matrix 256 x 256, 224 sagittal slices, 1 mm isotropic voxels, 
flip angle = 9 degrees. 
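
For orientation, the functional acquisition parameters translate into run duration, in-plane voxel size, and slab coverage as sketched below in Python. The coverage figure assumes that the 1 mm gap applies only between adjacent slices; the variable names are illustrative.

```python
# Functional scan parameters reported for Experiment 1.
TR_S = 2.0
VOLUMES_PER_RUN = 485
FOV_MM = 192
MATRIX = 64
N_SLICES = 35
SLICE_THICKNESS_MM = 3
SLICE_GAP_MM = 1

run_duration_s = TR_S * VOLUMES_PER_RUN
in_plane_voxel_mm = FOV_MM / MATRIX

# Assumption: gaps are counted only between neighboring slices (N_SLICES - 1 gaps).
slab_coverage_mm = N_SLICES * SLICE_THICKNESS_MM + (N_SLICES - 1) * SLICE_GAP_MM

minutes, seconds = divmod(run_duration_s, 60)
print(f"run duration: {int(minutes)} min {int(seconds)} s")
print(f"in-plane voxel size: {in_plane_voxel_mm} mm")
print(f"slab coverage: {slab_coverage_mm} mm")
```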


2.1.5 Behavioral data analysis 


The accuracy and mean reaction time in the probe-matching task were measured 
and subjected to a two-way analysis of variance (ANOVA) with two within-subjects 
factors: TOPICALIZATION (topicalized, non-topicalized) and HEAVINESS (hO, hS). 
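
In a 2 x 2 design in which both factors are within-subjects, each main effect and the interaction has one degree of freedom, so each F(1, n-1) statistic equals the squared one-sample t statistic computed on a per-participant contrast score. The following Python sketch illustrates this with made-up numbers; the toy values, the function name contrast_f, and the three-participant sample are hypothetical and are not the data analyzed in this chapter.

```python
import math
import statistics


def contrast_f(scores):
    """Return (F, (df1, df2)) for a 1-df within-subjects effect.

    `scores` holds one contrast value per participant; F is the squared
    one-sample t statistic of those values against zero.
    """
    n = len(scores)
    standard_error = statistics.stdev(scores) / math.sqrt(n)
    t_value = statistics.mean(scores) / standard_error
    return t_value ** 2, (1, n - 1)


# Hypothetical per-participant values: (hO-SOV, hS-SOV, hO-StopOV, hS-StopOV).
toy_data = [
    (10.0, 12.0, 11.0, 13.0),
    (10.0, 11.0, 12.0, 13.0),
    (10.0, 10.0, 12.0, 14.0),
]

# Contrast scores: topicalized minus non-topicalized, hS minus hO,
# and the difference of the hS-minus-hO differences.
topicalization = [(c + d) / 2 - (a + b) / 2 for a, b, c, d in toy_data]
heaviness = [(b + d) / 2 - (a + c) / 2 for a, b, c, d in toy_data]
interaction = [(d - c) - (b - a) for a, b, c, d in toy_data]

f_topicalization, df = contrast_f(topicalization)
f_heaviness, _ = contrast_f(heaviness)
f_interaction, _ = contrast_f(interaction)

for name, f_value in [("TOPICALIZATION", f_topicalization),
                      ("HEAVINESS", f_heaviness),
                      ("interaction", f_interaction)]:
    print(f"{name}: F{df} = {f_value:.2f}")
```

With the 19 participants retained in Experiment 1, the same computation yields the degrees of freedom (1, 18) reported in the Results.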


2.1.6 Whole-brain functional magnetic resonance imaging analysis 


Using the SPM12 software,³ we initially conducted a whole-brain analysis of the 
same preprocessed functional dataset used in Iwabuchi, Nakajima, and Makuuchi 
(2019). After the condition effects were estimated using a conventional general 
linear model approach (see Iwabuchi, Nakajima, and Makuuchi 2019 for 
detailed descriptions), the individual beta maps of the four experimental conditions 
(i.e., hO-SOV, hS-SOV, hO-S_{TOP}OV, and hS-S_{TOP}OV) were submitted to a full factorial 
ANOVA with two within-subjects factors, TOPICALIZATION and HEAVINESS. The 
main TOPICALIZATION effect was identified using a paired t-test to compare the 
brain activation for topicalized and non-topicalized sentences. The statistical maps 
were initially thresholded at p < 0.001 (uncorrected at the peak level). Statistical 
inferences were based on a threshold of p < 0.05 with family-wise error correction 
at the cluster level. 


3 Available at http://www.fil.ion.ucl.ac.uk/spm/ 


2.1.7 Volume-of-interest-based functional magnetic resonance 
imaging analysis 


We performed an a priori VOI analysis for the left PO and left pMTG in addition to 
the whole-brain analysis. Definitions of the VOIs in these two regions were provided 
by Iwabuchi, Nakajima, and Makuuchi (2019). From these predefined VOIs, we 
extracted the percent signal change during the sentence presentation period for 
each experimental condition and averaged it across three fMRI runs. We used the 
Marsbar toolbox (Brett et al. 2002) to obtain the percent signal changes. For each 
of the PO and pMTG VOIs, we submitted the calculated percent signal changes to 
a two-way ANOVA with the factors of TOPICALIZATION and HEAVINESS. We also 
performed a two-way ANOVA of the percent signal change data for each fMRI run. 
As described below, the experimental effect on brain activity may decrease as the 
fMRI runs progress (see Section 3.2). We conducted a post-hoc run-by-run analysis 
to examine whether such a trend was observed for the different types of experi- 
mental stimuli used in Experiment 1. 
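
The averaging step can be sketched as follows in Python. This is a simplified illustration, not the Marsbar computation itself: it assumes that percent signal change is the condition effect divided by the mean signal of the run, multiplied by 100, and all numbers and names are hypothetical.

```python
from statistics import mean


def percent_signal_change(effect, run_mean_signal):
    """Express a condition effect as a percentage of the run's mean signal."""
    return 100.0 * effect / run_mean_signal


# Hypothetical (condition effect, run mean signal) pairs for one VOI and
# one condition, one pair per fMRI run.
runs = [(0.8, 400.0), (0.6, 400.0), (1.0, 400.0)]

psc_per_run = [percent_signal_change(effect, baseline) for effect, baseline in runs]
psc_mean = mean(psc_per_run)

print([round(value, 3) for value in psc_per_run])
print(round(psc_mean, 3))
```

The per-run values correspond to the input of the run-by-run analysis, and their mean corresponds to the input of the analysis averaged across runs.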


2.2 Results 
2.2.1 Behavioral results 


Mean reaction time and standard deviation (SD) were as follows: hO-SOV, 1547±50 
ms; hS-SOV, 1570±56 ms; hO-S_{TOP}OV, 1622±64 ms; hS-S_{TOP}OV, 1581±64 ms. We found 
a significant main effect of TOPICALIZATION on reaction time (F[1,18] = 4.83, p = 
0.04). The other effects were not statistically significant (HEAVINESS, 
F[1,18] = 0.006, p = 0.94; interaction, F[1,18] = 0.68, p = 0.42). The accuracy in 
each condition (mean±SD) was as follows: hO-SOV, 89.4±2.0%; hS-SOV, 92.9±1.5%; 
hO-S_{TOP}OV, 91.6±2.2%; hS-S_{TOP}OV, 88.7±2.0%. Regarding accuracy, no significant 
effect or interaction was observed (TOPICALIZATION, F[1,18] = 1.25, p = 0.28; 
HEAVINESS, F[1,18] = 0.52, p = 0.48; interaction, F[1,18] = 1.87, p = 0.19). 

2.2.2 Functional magnetic resonance imaging results: Whole-brain analysis 


The contrast of topicalized versus non-topicalized sentences revealed a large signif- 
icant cluster from the supramarginal gyrus to the pMTG via the posterior superior 
temporal gyrus (pSTG) (peak coordinate, [-54 -55 26]; t-value, 4.49; cluster size, 46; 
Figure 1). We also identified a significant cluster in the precuneus (peak coordinate, 
[9 -64 22]; t-value, 3.71; cluster size, 51). We found no significant activation for 
the reverse contrast (non-topicalized > topicalized). The effect of HEAVINESS and the 
interaction also revealed no significant activation. 


[Figure 1 image: sagittal slice at x = -54] 

Figure 1: Result of the whole-brain analysis in Experiment 1. The numbers denote the x coordinates 
of the sagittal slices. 


2.2.3 Functional magnetic resonance imaging results: Volume of interest 
analysis 


We found a significant main effect of TOPICALIZATION for the left PO (F[1,18] = 
8.71, p = 0.0085; the top left panel in Figure 2) and a marginally significant effect 
for the left pMTG (F[1,18] = 4.00, p = 0.061; the bottom left panel in Figure 2). The main 
effect of HEAVINESS was also significant for the left PO (HEAVINESS, F[1,18] = 5.12, 
p = 0.036; interaction, F[1,18] = 2.32, p = 0.14), whereas neither the effect of HEAV- 
INESS nor an interaction was significant for the left pMTG (HEAVINESS, F[1,18] = 
0.004, p = 0.95; interaction, F[1,18] = 0.47, p = 0.50). 

As in Figure 2 (the right panels), a run-by-run VOI analysis revealed that 
the main effects of TOPICALIZATION were significant or marginally significant 
only during the third run (PO, F[1,18] = 16.92, p = 0.0007; pMTG, F[1,18] = 3.86, p = 
0.065) without any other reliable effects or interactions (p-values > 0.1). For both 
VOIs, we found no reliable effects or interactions during the first and second runs 
(p-values > 0.1). 

2.3 Discussion 


The finding that the left PO and pMTG showed increased activation for Japanese 
topicalized sentences relative to non-topicalized sentences supported the 
hypothesis that topicalization introduces additional neural costs in syntax-related sites by 
adding a syntactic layer. The slower response times for topicalized sentences than 
for non-topicalized ones also indicated that topicalization imposes an additional 
syntactic cost, in line with the more complex syntactic hierarchy that Imamura, Sato, 
and Koizumi (2016) attributed to topicalized sentences. 

Figure 2: Results of the volume of interest analysis in Experiment 1. Left panels show the mean 
percent signal changes for each experimental condition in the left PO and pMTG. Right panels show 
the results from the run-by-run analysis. For the right panels, bars in each run (Runs 1, 2, and 3) 
represent hO-SOV, hS-SOV, hO-S_{TOP}OV, and hS-S_{TOP}OV from left to right. Error bars denote standard 
errors. PO, pars opercularis; pMTG, posterior middle temporal gyrus; hO-SOV, non-topicalized 
sentences with a heavy object; hS-SOV, non-topicalized sentences with a heavy subject; hO-S_{TOP}OV, 
topicalized sentences with a heavy object; hS-S_{TOP}OV, topicalized sentences with a heavy subject. 

A significant effect of HEAVINESS was found in the left PO, indicating that the 
activity there was higher for the hS sentences than for the hO sentences. In 
this study, stimuli with a heavy subject had a “long-before-short” construction, 
whereas sentences with a heavy object had a “short-before-long” construction, 
as shown in (5). As preferred constructions are generally easier to process and 
can induce less activation in relevant brain regions than non-preferred ones, 
the increased neural cost for the hS construction seems inconsistent with prior 
findings of the “long-before-short” preference in Japanese (Yamashita and Chang 
2001). Another possible explanation is that the effect of HEAVINESS in the left PO 
reflects the relative frequencies of the hS and hO constructions. To examine 
the frequencies of SOV sentences with hS or hO, we conducted a preliminary 
corpus analysis using the Balanced Corpus of Contemporary Written Japanese, 
which contains approximately 100 million words. This analysis revealed that the 
number of S_{TOP}-hO-V constructions (61 tokens) was about six times that of 
hS_{TOP}-O-V constructions (10 tokens). 
However, for sentences with a ga-marked subject, the difference 
was relatively small between the number of S-hO-V constructions (25 tokens) and 
that of hS-O-V constructions (35 tokens). While the difference in the left PO activity 
was relatively large between S-hO-V and hS-O-V constructions with a ga-marked 
subject, it was small between S_{TOP}-hO-V and hS_{TOP}-O-V. This activation pattern is 
inconsistent with the frequency account of the HEAVINESS effect because the 
difference between S_{TOP}-hO-V and hS_{TOP}-O-V should be larger if the left PO activity 
reflected frequency. 
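
The frequency comparison can be made explicit with the token counts reported above. The following Python sketch only restates those counts as ratios; the dictionary layout and names are illustrative.

```python
# Token counts from the preliminary corpus analysis reported above.
token_counts = {
    "wa": {"S-hO-V": 61, "hS-O-V": 10},
    "ga": {"S-hO-V": 25, "hS-O-V": 35},
}

# Ratio of heavy-object to heavy-subject tokens for each subject particle.
ho_to_hs_ratio = {
    particle: counts["S-hO-V"] / counts["hS-O-V"]
    for particle, counts in token_counts.items()
}

for particle, ratio in ho_to_hs_ratio.items():
    print(f"{particle}: S-hO-V / hS-O-V = {ratio:.2f}")
```

A frequency account would predict the larger HEAVINESS effect where the ratio departs most from 1, that is, for wa-marked subjects, which is the opposite of the observed activation pattern.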

We must be cautious in generalizing these findings because this study had 
some limitations. Importantly, the literature has shown that predicate types or 
the presence of contextual information can influence the interpretation of wa 
and ga (Kuno 1973); therefore, the relative difficulties of wa- and ga-marked 
subjects may vary in different experimental settings. For example, in sentences 
such as “Taroo-wa yuushuu-da [Taroo is brilliant],” wa is more typically used, 
while a ga-marked subject is more typical in sentences such as “Taroo-ga 
yuushoo-sita [Taroo won the cup].” This study used the latter type of sentence, and we 
cannot exclude the possibility that the observed effect of topicalization depended 
on the use of specific predicate types. Moreover, the relative processing costs of 
wa and ga may also be affected by information structure; for example, Japanese 
speakers prefer a given-new order to the reverse order (Kuno 1978; Nakagawa 
2020). Future studies could disentangle these effects from the syntactic effect of 
topicalization. 

3 Experiment 2 


The differential processing cost of ga and no was examined with sentences with ga/ 
no subjects, placed at an adjacent and a non-adjacent position relative to the verb 
in a 2 x 2 factorial design with factors PARTICLE (ga/no) and DISTANCE (adjacent 
[Adj]/non-adjacent [Non-adj]). 


3.1 Methods 
3.1.1 Participants 


Twenty-two native Japanese speakers participated in the fMRI experiment. Data 
from four participants were discarded because of low task accuracy (mean accuracy 
across the four runs below 75%). Thus, data from 18 participants (11 females, aged 19-34 
years, mean age 27.3 years) were analyzed. No participants had a history of neu- 
ropsychiatric disorders, and all had normal or corrected-to-normal vision. All were 
right-handed (mean score of Flanders handedness questionnaire 100; Nicholls et al. 
2013) and gave their written informed consent. 


3.1.2 Stimuli 


The target stimuli of the experiment comprised the adjacency (DISTANCE, Adj vs. 
Non-adj) and the case-marking (PARTICLE, nominative ga vs. genitive no) factors in 
a 2 x 2 factorial design, yielding four experimental conditions. We used 30 lexical 
sets to create 120 target stimuli across the four conditions. As in (6), the Non-adj 
conditions contained two interveners: a temporal adverb and a locative postposi- 
tional phrase. The Adj conditions were constructed by scrambling the interveners 
of the Non-adj conditions to the front of the sentences. 


(6) a. Nominative ga, Non-adj 
       Titi-ga     sengetu     syuttyoosaki-de   nonda  osake-wa 
       father-NOM  last.month  business.trip-at  drank  sake-TOP 
       yasumono-datta-sooda. 
       cheap.one-was-I.heard 
       “I heard sake that (my) father drank on his business trip last month was 
       cheap.” 

    b. Genitive no, Non-adj 
       Titi-no     sengetu     syuttyoosaki-de   nonda  osake-wa 
       father-GEN  last.month  business.trip-at  drank  sake-TOP 
       yasumono-datta-sooda. 
       cheap.one-was-I.heard 
       “I heard sake that (my) father drank on his business trip last month was 
       cheap.” 

    c. Nominative ga, Adj 
       sengetu     syuttyoosaki-de   Titi-ga     nonda  osake-wa 
       last.month  business.trip-at  father-NOM  drank  sake-TOP 
       yasumono-datta-sooda. 
       cheap.one-was-I.heard 
       “I heard sake that (my) father drank on his business trip last month was 
       cheap.” 

    d. Genitive no, Adj 
       sengetu     syuttyoosaki-de   Titi-no     nonda  osake-wa 
       last.month  business.trip-at  father-GEN  drank  sake-TOP 
       yasumono-datta-sooda. 
       cheap.one-was-I.heard 
       “I heard sake that (my) father drank on his business trip last month was 
       cheap.” 
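
The way one lexical set yields the four conditions in (6) can be sketched in Python as follows. The function build_sentence and the dictionary layout are hypothetical helpers written for this illustration; they are not the procedure actually used to prepare the stimuli.

```python
from itertools import product


def build_sentence(subject, interveners, remainder, particle, adjacent):
    """Assemble one stimulus; in the Adj conditions the interveners are fronted."""
    marked_subject = f"{subject}-{particle}"
    if adjacent:
        words = [*interveners, marked_subject, *remainder]
    else:
        words = [marked_subject, *interveners, *remainder]
    return " ".join(words)


# The lexical set underlying example (6).
lexical_set = {
    "subject": "Titi",
    "interveners": ["sengetu", "syuttyoosaki-de"],  # temporal adverb, locative phrase
    "remainder": ["nonda", "osake-wa", "yasumono-datta-sooda."],
}

# Cross PARTICLE (ga/no) with DISTANCE (Non-adj/Adj).
stimuli = {
    (particle, "Adj" if adjacent else "Non-adj"): build_sentence(
        lexical_set["subject"],
        lexical_set["interveners"],
        lexical_set["remainder"],
        particle,
        adjacent,
    )
    for particle, adjacent in product(("ga", "no"), (False, True))
}

for condition, sentence in stimuli.items():
    print(condition, sentence)

N_LEXICAL_SETS = 30
total_targets = N_LEXICAL_SETS * len(stimuli)
print(total_targets)
```

Because the four versions differ only in the particle and in the position of the two interveners, lexical content is held constant across conditions.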


3.1.3 Procedure 


We used a rapid serial visual presentation paradigm that was almost identical to 
that of Experiment 1, except that the duration of each frame was 600 ms, resulting 
in a sentence duration of 4100 ms. The probes, created as in Experiment 1, followed 25% 
of the trials, and participants could not predict when they appeared. The probes 
motivated participants to read and understand every sentence. 


3.1.4 Functional magnetic resonance imaging data acquisition 


We collected MRI data using the same scanner as in Experiment 1. The difference 
was that we used a multiband-accelerated echo-planar imaging sequence in 
Experiment 2 (Moeller et al. 2010). The acquisition order was ascending, and the 
following parameters were used: TR = 1,000 ms, TE = 30 ms, flip angle = 68°, field of 
view = 192 x 192 mm, matrix 64 x 64, 30 axial slices, slice thickness = 3 mm with a 
1 mm gap, and multiband acceleration factor = 2. Four fMRI runs were conducted 
with 720 volumes per run. T1-weighted high-resolution structural images were 
acquired with the same parameters as in Experiment 1. 


3.1.5 Whole-brain functional magnetic resonance imaging analysis 


We conducted preprocessing and whole-brain analyses using the same protocol as 
in Experiment 1, except that the factors of PARTICLE and DISTANCE were included 
in the full factorial ANOVA. 


3.1.6 Volume-of-interest-based functional magnetic resonance 
imaging analysis 


As in Experiment 1, we performed the VOI analyses using the Marsbar toolbox to 
detect potentially weak effects. We assumed that a no-marked NP was initially 
interpreted as a non-subject and reinterpreted as a subject when a verb appeared. Hirotani 
et al. (2011) investigated the neural correlates of thematic and syntactic reanalysis 
in Japanese passive and causative constructions and identified the left pars 
triangularis (PT) in the IFG and the left pSTG as the loci of the effects. Using spherical 
VOIs with a 6 mm diameter centered on the relevant coordinates (Hirotani et al. 2011; 
Figure 3), we performed VOI analyses for each run. 


Figure 3: The left PT and pSTG VOIs for Experiment 2. Left, the left PT spherical VOI centered at 
[-54 27 6]. Right, the left pSTG spherical VOI centered at [-42 -57 21]. Both VOIs had a 6 mm 
diameter. PT, pars triangularis; pSTG, posterior superior temporal gyrus. 
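
To give a sense of the size of such a VOI, the following Python sketch lists the voxel centers that fall inside a 6 mm diameter sphere. It assumes an isotropic voxel grid aligned with the sphere center; the 3 mm voxel size and the function name sphere_voxels are assumptions made for illustration, and this is not how Marsbar constructs its spheres internally.

```python
from itertools import product


def sphere_voxels(center_mm, diameter_mm, voxel_mm):
    """Return voxel centers (in mm) within a sphere around `center_mm`.

    Assumes an isotropic grid with spacing `voxel_mm` that contains the center.
    """
    radius = diameter_mm / 2
    steps = int(radius // voxel_mm)
    offsets = [i * voxel_mm for i in range(-steps, steps + 1)]
    cx, cy, cz = center_mm
    return [
        (cx + dx, cy + dy, cz + dz)
        for dx, dy, dz in product(offsets, repeat=3)
        if dx * dx + dy * dy + dz * dz <= radius * radius
    ]


PT_CENTER = (-54, 27, 6)     # left PT center from Figure 3
PSTG_CENTER = (-42, -57, 21)  # left pSTG center from Figure 3

# Hypothetical 3 mm isotropic grid.
pt_voxels = sphere_voxels(PT_CENTER, diameter_mm=6, voxel_mm=3)
pstg_voxels = sphere_voxels(PSTG_CENTER, diameter_mm=6, voxel_mm=3)

print(len(pt_voxels), len(pstg_voxels))
```

On such a grid the sphere covers only the center voxel and its six face neighbors, so the VOIs are small and spatially specific.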

3.2 Results 
3.2.1 Behavioral results 


Because the probes followed only 25% of the trials, we deemed the performance 
for each condition not reliable enough to make statistical inferences about 
differences between conditions. We used accuracy rates when selecting the participants for 
the fMRI data analysis and excluded four whose overall accuracy rates were <75%. 


3.2.2 Functional magnetic resonance imaging results: Whole-brain analysis 


The whole-brain analyses did not reveal any statistically significant activation. 


3.2.3 Functional magnetic resonance imaging results: 
Volume of interest analysis 


For data averaged across all runs, a two-way within-subjects ANOVA revealed a sig- 
nificant main effect of DISTANCE in the PT (F[1,22] = 4.85, p = 0.038), but the effect 
of PARTICLE (F[1,22] = 0.05, p = 0.83) and an interaction (F[1,22] = 0.43, p = 0.51) 
were not significant (Figure 4). We found no significant effect or interaction in the 
pSTG (p-values > 0.61). We then conducted a run-by-run post-hoc analysis. For the 
first run, we found significant main effects of PARTICLE (F[1,22] = 8.77, p = 0.007) 
and DISTANCE (F[1,22] = 14.13, p = 0.001) without an interaction (F[1,22] = 0.17, p = 
0.68) in the PT; in the pSTG, only the effect of DISTANCE was significant 
(F[1,22] = 7.15, p = 0.01; other p-values > 0.1). For the succeeding runs, no 
reliable effect was revealed in either VOI (p-values > 0.09). 


3.3 Discussion 


Consistent with previous studies showing that no-marked subjects placed at a 
distance from the verb are the most costly to process, the no Non-adj condition showed 
the highest activity in the VOIs, although the PARTICLE x DISTANCE 
interaction was not statistically significant and the main effect of PARTICLE was 
significant only in the first run. The brain activity appears to reflect relative 
frequency (ga-marked subjects are produced far more often than no-marked 
ones), as the contrast “no > ga” was significant only in the first run and not thereafter. 
Although participants were aware that the no-marked NPs were subjects, the habitual 
adnominal reading presumably competed with the subject reading in the first run 
because of the low frequency of no-marked subjects. 


4 General discussion 


We examined the neural costs for processing Japanese subject-marking particles 
wa, ga, and no. Experiment 1 demonstrated that the left PO and pMTG, both of 
which are responsive to syntactic hierarchy, showed increased activity for Japanese 
topicalization, supporting the hypothesis that the S_{TOP}OV construction has 
a more complex syntactic structure than the SOV construction. In Experiment 2, 
the activities in the left PT and pSTG, involved in syntactic reanalysis, were 
significantly higher for the no condition than for the ga condition, but the effect was 
found only in the first fMRI run. 

Unlike Experiment 1, the run-by-run analysis in Experiment 2 revealed the 
effects of PARTICLE and DISTANCE only during the first run; the left PO and pMTG 
consistently showed higher activity for the topicalized sentences than for the 
non-topicalized sentences in Experiment 1. However, the differences did not reach 
statistical significance during the first and second runs, perhaps because of a lack 
of statistical power. Hence, future studies should investigate the cause of the chang- 
ing patterns of activity across sessions. 

This study provides the first evidence that distinct Japanese subject-mark- 
ing particles (i.e., wa, ga, and no) drive neural systems associated with syntactic 
processing differently. The findings endorse further applications of fMRI toward 
less-studied languages that have similar subject-marking particles such as Uyghur 
(Asarina 2011).

Chapter 10

Pragmatic atypicality of individuals with 
autism spectrum disorder: Preliminary 
results of a production study of sentence- 
final particles in Japanese 


1 Introduction 


Autism-spectrum disorder (ASD) is a neurodevelopmental disorder characterized by
difficulties in social communication and interaction, and by the presence of restricted
and repetitive patterns of behavior, interests, or activities (American Psychiatric
Association 2013). The formal language skills (e.g., at the lexical, morphosyntactic, or
semantic level) of people with ASD differ across individuals, but they generally have
impairments in pragmatics (especially in the use of language in social communication;
American Psychiatric Association 2013; Asperger 1991; Eigsti et al. 2011; Kanner
1943; Landa 2000; Tager-Flusberg, Paul, and Lord 2005), such as conversational
inference, indirect speech acts, deictic expressions, and irony (Dennis, Lazenby, and
Lockyer 2001; Happé 1993, 1994; Kalandadze et al. 2018; Lee, Hobson, and Chiat
1994; Loveland et al. 1988). As most studies on these issues examine speakers of
English and other European languages, it is necessary to examine the issues faced by
speakers of non-European languages to understand the full picture of the pragmatic
problems of ASD.

Sentence-final particles (SFPs) are a particularly pragmatic feature of East and
Southeast Asian languages, including Japanese. SFPs are bound morphemes that occur
at the end of a sentence (Cooke 1989; Kwok 1984; Law 2002; Yamada 1908: 680-684).
As they have no referential meaning, SFPs do not affect the truth conditions of a
sentence (Davis 2011; McCready 2005, 2009), instead expressing the speaker's attitudes,
moods, or feelings (Cook 1988, 1992; Endo 2012; Iwasaki and Yap 2015; Miyagawa 2022:
138-195). In Japanese, the SFPs -ne and -yo are those most frequently used in casual
conversations (Maynard 1993: 183-220, 1997). It is often suggested that -ne represents
the speaker's (S) attitudes such as confirmation, agreement, or cooperation with the
addressee (A), and -yo represents S's attitudes such as notification, explanation,
emphasis, or insistence (e.g., Cook 1988; McCready and Davis 2020; Miyagawa 2022:
138-195; Uyeno 1971).

Many scholars of Japanese linguistics assume that typical usages of -ne and -yo
depend on whether the speaker (S) and the addressee (A) know the propositional
information of the utterance (e.g., Kamio 1994, 1997; Maynard 1993: 183-220;
Muraki and Koizumi 1989).^3 When A knows the proposition of the utterance
(hereafter, the AK condition), Japanese speakers typically end the utterance with
-ne, and when A does not know it (hereafter, S exclusively knows: SeK condition),
Japanese speakers typically end the utterance with -yo. For example, in the scene
in Figure 1(a), A knows that Eagles have now won the championship, given that A
and S are watching the game together on TV. In this AK condition, the SFP -ne can
express S's attitude of confirmation or of sharing the emotion with A. Using -yo
in the AK condition is supposed to be atypical (although grammatical) because -yo
implies that S is explaining the information to A. However, in a similar scene under
the SeK condition (Figure 1b), -yo would be typical because A, who is cooking in
the kitchen and thus not watching the game, does not know the information. By
attaching -yo at the end of the utterance, S can show a willingness to notify A.

Some SFPs can appear in combination with others. The above -yo and -ne often
co-occur in the order -yone, but never in the order -neyo (for an explanation,
see Endo 2012, and see Miyagawa 2022: 138-195 for an analysis of the sentence
structure of SFPs).^4 The combined -yone is more typically used in the AK than in
the SeK condition (see Oshima 2014b for details of the contexts in which -yone is
used), indicating S's attitude of insistence and, at the same time, seeking A's
confirmation or agreement (Cook 1988). Based on the assumption that -yone is a
combination of -yo and -ne, this study considers a sentence with -yone as a sentence
ending in -ne.


1 There are exceptions, given the ongoing debate over the functions and usage of SFPs (McCready
2009; McCready and Davis 2020; Oshima 2014a, 2014b; Takubo and Kinsui 1997). Nevertheless, this
study focuses on how speakers with ASD used -ne and -yo in instances where their counterparts with
typical development (TD) used them. We therefore focused on when TD individuals typically used
-ne and -yo and prepared the task with reference to traditional theoretical classifications.

2 It has been argued that -yo is also used in soliloquy situations (Hasegawa 2010).

3 The typical usage of the SFPs in this study is that of speakers of the Tokyo dialect. Usage varies
considerably across dialects (Konishi 2020).

4 Some researchers argue that -yone is a single SFP rather than a combination of -yo and -ne, based
on the analysis of the relationship between intonation and discourse function (e.g., Oshima 2014b).
However, given that this study does not analyze the prosodic effect, we followed the major
assumption that -yone is morphologically a combination of -yo and -ne (Cook 1988; Endo 2012;
McCready 2009; Miyagawa 2022: 138-195; Takiura 2008; Takubo and Kinsui 1997).

Despite its brevity (one mora), the important role of the SFP in verbal communication
has been identified by various studies. SFPs are frequently used by Japanese
speakers in casual conversations, appearing in approximately one of every 2.5
phrase-final positions (Maynard 1993: 183-220, 1997). However, they hardly appear
in formal situations, such as court sessions or press reports (Cook 1988; Maynard
1993: 183-220). Further, several experimental studies show that native Japanese
speakers spontaneously perceive attitudes expressed via SFPs (Matsui, Yamamoto, and
McCagg 2006; Matsui et al. 2009). Some studies also argue that SFPs function as
turn-taking operations (Tanaka 2000), facilitate conversations (Kajikawa, Amano,
and Kondo 2004), or control interpersonal relations (Takiura 2008).


(a) The addressee knows the proposition (AK).

Eagles-ga     ima   yuushoosita            -ne
Eagles-NOM    now   won the championship   SFP
"Eagles have now won the championship."

(b) The speaker exclusively knows the proposition (SeK).

Eagles-ga     ima   yuushoosita            -yo
Eagles-NOM    now   won the championship   SFP
"Eagles have now won the championship."


Figure 1: Examples of typical usages of Japanese common SFPs. (a) The typical condition for -ne use, 
where the addressee knows (AK) the proposition. (b) The typical condition for -yo use, where the 
speaker exclusively knows (SeK) the proposition. 

Note: NOM, nominative; SFP, sentence-final particle.

Prior studies suggest that individuals with ASD show atypicality in their use of SFPs,
but there is not enough data to form a comprehensive picture of the situation. Clinical
observations and reported data indicate that native Japanese children with ASD seldom
use SFPs, especially -ne and -yo, in casual conversations, unlike children with typical
development (TD), who use them frequently (Satake and Kobayashi 1987; Watamaki 1997).
However, these studies only observed one or a few participants. Other experimental
studies show some aspects of the atypicality of SFP use in individuals with ASD. For
example, native Cantonese speakers with ASD produced fewer variations in SFPs than
those with TD (Chan and To 2016) and were insensitive to S's intention, as expressed
by the given SFP (Li et al. 2013). Similarly, among native Japanese speakers, TD adults
with high autistic traits (i.e., high Autism-Spectrum Quotient [AQ] scores) understand
S's attitudes from SFPs less flexibly than those with low AQ (Kiyama et al. 2018;
Kiyama et al. 2020).^5 Atypicality in the prosodic aspect of SFPs in individuals with
ASD has also been suggested. TD native Japanese speakers with higher AQ utter shorter
SFPs than those with lower AQ (Kiyama, Song, and Nasukawa 2021). However, it remains
unclear whether the alleged atypicality of SFP production is a characteristic of
native Japanese speakers with ASD.

This chapter compares tendencies in the SFP choices of native Japanese-speaking
adults with ASD with those of adults with TD. As a preliminary study for a future
large-scale investigation, we limited our focus to the most common SFPs: -ne and -yo.
Here, we utilize an oral discourse completion task (ODCT), which presents participants
with the contexts of the AK and SeK conditions and asks them to freely produce the
sentence-final expressions that they find suitable in the given context. However,
individuals with ASD may have difficulty grasping AK/SeK when it is implicitly
presented through written and drawn materials, as previous psychological studies
report their atypicality in inferring others' mental states from linguistic and
physical contexts (Baron-Cohen 1995; Baron-Cohen, Leslie, and Frith 1985; Frith 2001;
Senju et al. 2009). To prevent such failures, we explicitly presented the knowledge
of the hypothesized S and A in the ODCT and tested whether the participants correctly
remembered this information.

We hypothesize that the choice of the SFPs -ne and -yo varies with the factors
of Context (AK/SeK condition) and Group (ASD/TD participants). For Context, the
production of -ne (-yo) should be more frequent under the AK (SeK) than under the
SeK (AK) condition. For Group, adults with ASD should produce -ne and -yo less
frequently than TD adults. On the Context-Group interaction, the frequencies of -ne
and -yo in the TD group should differ between the AK and SeK conditions, but this
difference should not be observed for the ASD group.


5 Autism-Spectrum Quotient (AQ) (Baron-Cohen et al. 2001) is a self-administered instrument for
measuring the degree to which an adult with normal intelligence has the traits associated with
ASD. AQ assumes a continuum from ASD to TD, with higher AQ scores indicating higher autistic
traits.


2 Methods 
2.1 Participants 


Eleven adults with ASD (eight males, 20-48 years old, mean = 33.82, SD = 8.13) and
14 TD adults (eight males, 20-24 years old, mean = 21.86, SD = 1.35) participated in
the ODCT. All participants were native Japanese speakers. We recruited ASD
participants from Hasegawa Hospital in Tokyo based on a diagnosis using the Diagnostic
and Statistical Manual of Mental Disorders, Fifth Edition (American Psychiatric
Association 2013). The TD participants were (under)graduate students at Tohoku
University in Sendai City and reported no history of neuropsychiatric disorders. This
study was approved by the ethics committees of the National Rehabilitation Center
for Persons with Disabilities in Japan, Hasegawa Hospital, and Tohoku University in
the Kawauchi South District. The present study was conducted in accordance with
the Declaration of Helsinki, and all participants provided written informed consent.


2.2 Stimuli 


In the ODCT, we created eight dialogs for each of the AK and SeK conditions (Table 1),
which specified whether the hypothesized S and A knew the proposition. Participants
were asked to produce any sentence-final expressions they would use in the
given context. Each trial had four phases: a context phase, a preceding utterance
phase, a target utterance phase that allowed participants to attach any sentence-final
expression, and a comprehension question phase. The context phase described whether
both S and A in the target utterance knew the proposition or only S knew it. Next,
the utterance by the interlocutor was presented. In the target utterance phase that
followed, the end of the sentence was left blank to be completed using a
sentence-final expression. In the comprehension question phase, we asked about the
context and the knowledge status of S and A.

Throughout the task, each participant was asked to imagine acting as the
hypothesized S, interacting with the same hypothesized A named "Suzuki" (a
common Japanese family name), who was a long-time acquaintance of the same
age and sex. We designed the ODCT using the same A and did not include multiple
persons, given the suggestion that individuals with ASD have limited imagination
(Baron-Cohen et al. 2001) and that the cognitive load of imagining many hypothetical
situations may be too high for them. Similarly, A's age and sex were fixed because
Japanese speakers select SFPs depending on interpersonal relationships (Kamio 1994),
whereas individuals with ASD show atypicality in reflecting such relationships in
linguistic expressions (Zalla et al. 2014).


2.3 Procedure 


Participants were visually presented with the stimulus materials for all four phases
of one trial, using a paper handout for ASD participants and a PC monitor via Zoom
online meeting software (Zoom Video Communications, Inc., San Jose, CA) for TD
participants. While they saw the stimuli, the experimenter read aloud the context
and the preceding utterance phases and asked them how they would utter the
target utterance if they were in the situation. The participants were then instructed
to read aloud the target utterance sentence, filling in the blank with a
sentence-final expression. They were then given three oral comprehension questions to
check whether they had understood the context correctly. Participants were
allowed to change their answers as many times as they wished during the trial but
were not allowed to return to previous trials. The researcher documented the
final answers. The task was conducted face-to-face for ASD participants and online
using Zoom for TD participants.


2.4 Data Analysis 


We excluded trials in which participants made errors in any of the comprehension 
questions before statistical analysis, as we aimed to examine the production of SFP 
when the participants had correctly grasped the knowledge status of the S and A. 
We also excluded expressions other than the SFPs -ne and -yo from the following 
analysis. 
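
The two exclusion steps just described can be sketched as a small filter. This is only an illustration in Python (the reported analysis was run in R), and it takes one literal reading of the second step, in which trials ending in anything other than -ne or -yo are dropped. The `Trial` record, its field names, and the helper functions are hypothetical; the recoding of -yone as -ne follows the treatment stated elsewhere in this chapter.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Trial:
    participant: str                  # hypothetical participant ID
    context: str                      # "AK" or "SeK"
    correct: Tuple[bool, bool, bool]  # answers to comprehension questions Q1-Q3
    ending: str                       # produced ending, "" if no SFP was used


def code_sfp(ending: str) -> Optional[str]:
    """Map a produced ending to 'ne' or 'yo'; anything else is None."""
    if ending in ("ne", "yone"):      # -yone is counted as ending in -ne
        return "ne"
    if ending == "yo":
        return "yo"
    return None


def select_trials(trials: List[Trial]) -> List[Tuple[Trial, str]]:
    """Keep trials with all questions correct and an ending coded ne/yo."""
    kept = []
    for trial in trials:
        if not all(trial.correct):    # step 1: any comprehension error
            continue
        sfp = code_sfp(trial.ending)
        if sfp is None:               # step 2: other expressions or no SFP
            continue
        kept.append((trial, sfp))
    return kept
```

Whether trials with other endings are dropped entirely or instead coded as non-occurrences of the target SFP is not specified in the text, so the second step should be adapted to the intended response coding.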

We constructed logistic mixed-effects models with maximal random effect
structures (Barr et al. 2013) that had Context (AK/SeK), Group (ASD/TD), and their
interaction as the fixed effects on the production of -ne and -yo, respectively. We
included the by-subject (by-item) random slope for Context (Group) with the
by-subject (by-item) random intercept. We compared the most complex model
with the simpler model using a chi-square likelihood-ratio test to determine whether
the random slope parameters improved model fit. If the difference was significant,
we adopted the model with the larger log-likelihood; otherwise, we adopted
the simpler one. If a model did not converge, we simplified the random effects
structure (Jaeger 2009). When the interaction was significant, we used Bonferroni
correction to perform post-hoc comparisons of Group (ASD/TD) by Context (AK/SeK).
Statistical analysis was conducted using R version 3.6.1 (R Development Core
Team 2018), with the lme4 package (Bates et al. 2014) for model estimation and
emmeans (Lenth et al. 2020) for post-hoc comparison.
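
The model-selection rule above rests on a likelihood-ratio test between nested models, and the post-hoc step on a Bonferroni adjustment. The following is a minimal Python sketch of both calculations, not the R code used in the study. The function names are hypothetical, and the log-likelihoods and the number of extra parameters must be supplied from fitted models.

```python
import math
from typing import List, Tuple


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        # Even df: closed-form finite sum.
        return math.exp(-h) * sum(h ** k / math.factorial(k)
                                  for k in range(df // 2))
    # Odd df: normal-tail term plus a finite sum.
    series = sum(h ** (k - 0.5) / math.gamma(k + 0.5)
                 for k in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(h)) + math.exp(-h) * series


def likelihood_ratio_test(loglik_simple: float, loglik_complex: float,
                          extra_params: int) -> Tuple[float, float]:
    """Return the chi-square statistic and p-value for two nested models."""
    stat = 2.0 * (loglik_complex - loglik_simple)
    return stat, chi2_sf(stat, extra_params)


def bonferroni(p_values: List[float]) -> List[float]:
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

Under this rule, the more complex model is retained only when the returned p-value falls below the chosen alpha level. Note that testing variance components on the boundary of the parameter space makes this chi-square reference distribution conservative, which is a general property of the test rather than something reported in this chapter.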


3 Results 


Table 2 summarizes the accuracy of the comprehension questions. The ASD and TD 
groups understood the contents of the stimuli well. Figure 2 shows the proportion 
of SFPs produced by the participants in both groups, indicating that native Japanese 
speakers generally prefer -yo in SeK conditions. Meanwhile, they select various 
SFPs in the AK condition. Moreover, individuals with TD used SFPs in almost all 
trials, while those with ASD produced utterances with no SFP more frequently than 
those with TD. In the following statistical analyses, we merged the production of 
-yone into -ne, as explained in the introduction. 


Table 2: Mean (SD) of accuracy rates of comprehension questions. 


Question                           Group   AK condition    SeK condition

Q1 (the content of the context)    ASD     93.18 (25.35)   98.86 (11.00)
                                   TD      98.21 (13.30)   100
Q2 (knowledge of S)                ASD     89.77 (30.47)   98.86 (10.10)
                                   TD      98.21 (13.30)   99.11 (9.45)
Q3 (knowledge of A)                ASD     94.32 (23.28)   98.86 (10.70)
                                   TD      99.11 (9.45)    100


Notes: A and S denote the addressee and the speaker, respectively. 


6 Beyond -yo and -ne (-yone), the participants produced -nokana, -kana, -no, -ka, -na, -ttesa, -tte, 
-kke, -kashira, -jan, and -janai. These expressions were grouped together as one category (i.e., Other 
SFPs).

Figure 2: Proportion of Japanese sentence-final particles (SFPs) produced in each
condition by a) adults with autism-spectrum disorder (ASD, n = 11) and b) those with typical 
development (TD, n = 14). 


The logistic mixed-effects model analysis of utterances ending with -ne (Table 3)
revealed significant main effects of Context (estimate = 2.608, SE = .624, z = 4.178,
p < .001) and Group (estimate = -2.489, SE = 1.058, z = -2.352, p = .019) but no
significant interaction between the two factors (estimate = 1.023, SE = 1.119,
z = .914, p = .361). The results indicated that the frequency of -ne in the AK
condition was significantly higher than in the SeK condition and that the ASD
participants produced -ne less frequently than those with TD.

Table 3: Fixed effects of logistic mixed-effects model analysis on the production
of utterances ending with -ne.


                   estimate   SE      z        p
(Intercept)        -2.282     .481    -4.748   <.001
Context            2.608      .624    4.178    <.001
Group              -2.489     1.058   -2.352   .019
Context × Group    1.023      1.119   .914     .361


Notes: Final model = glmer(-ne ~ Context * Group + (1 | Participant) + (1 | Item),
family = binomial).
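
The z and p columns of a table such as Table 3 follow from the estimates and standard errors by a Wald test, which can be checked by hand. The sketch below is a Python illustration with hypothetical function names, not output from the original analysis. Because it starts from the rounded table values, its results match the reported ones only up to rounding.

```python
import math
from typing import Tuple


def wald_test(estimate: float, se: float) -> Tuple[float, float]:
    """Return the Wald z statistic and its two-sided normal p-value."""
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p


def inv_logit(log_odds: float) -> float:
    """Convert a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Estimates and standard errors copied from Table 3 (log-odds scale).
TABLE_3 = {
    "(Intercept)": (-2.282, 0.481),
    "Context": (2.608, 0.624),
    "Group": (-2.489, 1.058),
    "Context x Group": (1.023, 1.119),
}
```

Calling `wald_test(*TABLE_3["Group"])` gives z of about -2.35 and p of about .019, in line with the table. `inv_logit` turns a sum of coefficients into a predicted probability, but which coefficients to sum depends on how Context and Group were coded, which the text does not state.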


As for -yo, the logistic mixed-effects model analysis (Table 4) indicated a significant
main effect of Context (estimate = -4.662, SE = .620, z = -7.514, p < .001) and a
significant interaction between Context and Group (estimate = 1.609, SE = .678,
z = 2.371, p = .018), but the main effect of Group was not significant
(estimate = -.102, SE = .664, z = -.153, p = .878). The frequency of -yo in SeK was
significantly higher than that in AK. According to the post-hoc comparisons (Table 5),
ASD participants used -yo significantly more frequently than those with TD in the AK
condition (estimate = -1.507, SE = .695, z = -2.168, p = .030), whereas the difference
in the SeK condition was not significant between the TD and ASD groups
(estimate = .102, SE = .644, z = .153, p = .878).


Table 4: Fixed effects of logistic mixed-effects model analysis on the production
of utterances ending with -yo.


                   estimate   SE      z        p
(Intercept)        1.934      .495    3.908    <.001
Context            -4.662     .620    -7.514   <.001
Group              -.102      .664    -.153    .878
Context × Group    1.609      .678    2.371    .018


Notes: Final model = glmer(-yo ~ Context * Group + (1 | Participant) + (1 | Item),
family = binomial).


Table 5: Results of multiple comparisons of the production of utterances
ending with -yo.


                      estimate   SE      z        p

AK/TD vs AK/ASD       -1.507     .695    -2.168   .030
SeK/TD vs SeK/ASD     .102       .644    .153     .878
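
The post-hoc estimates for -yo can be related to the fixed effects by simple arithmetic. The sketch below assumes treatment (dummy) coding with SeK and TD as the reference levels. This coding is not stated in the text and is an assumption, although it reproduces the reported contrasts from the reported coefficients. The function names are hypothetical.

```python
import math

# Fixed-effect estimates for -yo on the log-odds scale, as reported.
GROUP = -0.102            # assumed: ASD minus TD in the reference context (SeK)
CONTEXT_X_GROUP = 1.609   # assumed: change in that difference under AK


def group_effect(in_ak: bool) -> float:
    """ASD-minus-TD difference in log-odds of -yo (assumed coding)."""
    return GROUP + (CONTEXT_X_GROUP if in_ak else 0.0)


def td_vs_asd(in_ak: bool) -> float:
    """TD-minus-ASD contrast, i.e. the sign convention of 'TD vs ASD'."""
    return -group_effect(in_ak)


def odds_ratio(log_odds: float) -> float:
    """Convert a log-odds difference to an odds ratio."""
    return math.exp(log_odds)
```

Under this assumption, the AK contrast is -(-.102 + 1.609) = -1.507 and the SeK contrast is .102, matching the estimates reported for the two comparisons. The AK difference corresponds to an odds ratio of roughly 4.5 in favor of -yo for the ASD group.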




4 Discussion 


This study found differences in the use of the Japanese SFPs -ne and -yo in the
ODCT between native Japanese speakers with ASD and those with TD. It confirmed
the patterns reported in previous case studies (Satake and Kobayashi 1987;
Watamaki 1997) and revealed additional details. Adult participants with ASD produced
SFPs, especially -ne, less often than those with TD under both the AK and SeK
conditions. This tendency accords with previous studies reporting the lack of use of
SFPs, especially -ne, in conversations by children with ASD (Satake and Kobayashi 1987;
Watamaki 1997). In contrast, adults with ASD overused -yo in the AK condition,
where their TD counterparts preferred -ne. Notably, this overuse of -yo has not been
reported in previous studies. Both the ASD and TD groups, however, produced more -ne
in the AK than in the SeK condition and more -yo in the SeK than in the AK condition.
Thus, adult native Japanese speakers with ASD have acquired the context-dependent use
of the SFPs -ne and -yo to some extent, unlike children with ASD, who seldom used
SFPs (Satake and Kobayashi 1987; Watamaki 1997).

The design of ODCT allowed for interpreting differential patterns of SFP use 
between people with ASD and TD. The patterns observed in this study reflect their 
preferences in social language use unobscured by the challenges of people with 
ASD in inferring others’ knowledge. Prior studies suggest that people with ASD 
have atypicality in understanding others’ mental states, which are implicitly pre- 
sented in context (Baron-Cohen 1995; Baron-Cohen, Leslie, and Frith 1985; Frith 
2001; Senju et al. 2009). To avoid this possible confounding factor, the ODCT was 
designed to ensure that explicit information was provided about the knowledge 
of characters in the context phase. Additionally, the study utilized comprehension
questions on the knowledge status of the characters, and erroneous trials were
excluded from the analysis. Hence, the differential patterns of SFP use between
ASD and TD should not be attributed to a failure by the ASD group to understand
the context information.

Assuming that the ASD and TD groups have the same understanding of the given
contexts, the groups should have different strategies for reflecting the contextual
information in linguistic expressions. First, individuals with ASD may have different
lexical or structural representations of SFPs than those with TD. The overall
frequency of -yo did not differ between the two groups, but individuals with ASD used
-ne less frequently than those with TD. This corresponds to a previous report that a
child with ASD who does not use SFPs naturally can learn to use -ne and -yo through
training but then preserves the usage of -yo while that of -ne declines in natural
conversations (Matsuoka, Sawamura, and Kobayashi 1997). As proposed in Miyagawa
(2022: 138-195) based on Matsuoka, Sawamura, and Kobayashi (1997), individuals with
ASD may have similar knowledge of -yo as those with TD, but atypical knowledge of -ne
(see Endo in press and Miyagawa 2022: 138-195 for a detailed analysis of the
differences in the syntactic representation of -yo and -ne between ASD and TD). Second,
individuals with ASD may have a biased knowledge of how to use language according to
context. Focusing on context, the atypical usages of -ne and -yo by the ASD group were
found in the AK condition. In this condition, where the addressee knows the
proposition, those with ASD used -ne and -yo to almost the same degree, while their TD
counterparts preferred -ne to -yo. Consistent with some studies indicating that
individuals with ASD tend to take an egocentric context or perspective in pragmatic
language processing (Deliens et al. 2018; Kissine 2012), the native Japanese-speaking
participants with ASD in this study may have established knowledge of language use in
an egocentric context (SeK condition) but have limited knowledge in a context where
A's perspective must be considered (AK condition). This may induce a high cognitive
load for SFP choices in the AK condition, thereby resulting in poorly controlled SFP
use. Third, individuals with ASD may have different goals from their TD counterparts
in social communication. Given the attitudes each SFP is supposed to express,
the less frequent use of -ne in the ASD group may reflect a lower intention to share
a proposition with others than in the TD group.^7 As for -yo in the AK condition,
those with ASD may be more inclined than those with TD to emphasize S's knowledge of
the proposition to A, regardless of whether A already knows it. Such inappropriate
use of self-insisting -yo may induce an undesirable impression that those with ASD
ignore or do not consider the mental states of others. These distinctive attitudes
of individuals with ASD may be related to the atypicality of social communication
and interaction.

This study has four limitations that future studies must address. The first is the
small number of participants: the study had 14 TD participants and 11 ASD
participants. Future studies must recruit more participants with stricter control of
participants' attributes to draw clearer conclusions, as we did not thoroughly control
for age, intelligence, educational experience, or the types and degree of the symptoms
of ASD or other related impairments. Although some studies suggest subgroups of
language impairments within ASD (Kjelgaard and Tager-Flusberg 2001), this study
did not consider the variations in SFP usage within the ASD group. Therefore, as
the next step, future studies must examine how individuals differ in SFP usage
within the ASD population. Second, this investigation of Japanese SFP
usage focused on -ne and -yo, but there are many other SFPs in Japanese. Notably,
the right periphery of the Japanese clause allows speakers to utter various types of
elements, including other types of SFPs and modal particles, to express the
speaker's intentions and attitudes toward the sentence (Uyeno 1971). Thus, further
research must encompass other sentence-final expressions to elucidate the
pragmatic atypicality in native Japanese speakers with ASD. Third, there is a lack of
analysis of the prosody of ASD speech when using SFPs. Individuals with ASD
have prosodic problems in production and comprehension (Shriberg et al. 2001),
unlike TD speakers, who change the functions of the SFP through prosody (Oshima
2014a, 2014b). Lastly, the mechanisms underlying such individual differences in
SFP use must be probed. Many studies suggest that pragmatic problems in
individuals with ASD are related to atypical cognitive traits, such as joint attention
(Charman et al. 2003), perspective taking (Loveland 1984), theory of mind
(Baron-Cohen et al. 1988), and empathy (Watamaki 1997). Regarding the atypical use
of SFPs in individuals with ASD, several hypotheses hinge on the relation between
linguistic representations and cognitive characteristics (Endo in press; Miyagawa
2022: 138-195; Watamaki 1997). It is necessary to consider their cognitive
characteristics to clarify how the atypicality of neurodevelopment in ASD relates to
atypical SFP usage in verbal communication, and ultimately to the pragmatic
atypicality of ASD in general.


7 A training study for a Japanese child (Matsuoka, Sawamura, and Kobayashi 1997) reported that
the accuracy of -ne is retained over a long period, but the frequency of -ne in daily conversations
decays rapidly. It cannot be concluded whether this -ne use stems from atypicality in knowledge of
SFP, communicative intention, or both.


5 Conclusion 


The present preliminary study used an ODCT to differentiate the uses of the Japanese
SFPs -ne and -yo between native adult speakers with ASD and those with TD.
Adults with ASD used the SFPs -ne and -yo in a typical manner to a certain degree, but
the frequency of their use of SFPs was different from that of those with TD. Relative
to TD speakers, ASD speakers less frequently produced -ne, which is typically used
to share a proposition with another, and they excessively produced -yo, which
typically shows the speaker's insistence, in contexts where this was considered
inappropriate. To the best of our knowledge, this study was the first to analyze
atypical uses of SFPs in adults with ASD through statistical group comparisons. The
current preliminary study shows that the prospects for future large-scale
investigations are promising.

Chapter 11

Auditory comprehension of Japanese 
scrambled sentences by patients with 
aphasia: An ERP study 


1 Introduction 


Aphasia is an acquired neurogenic language disorder resulting from brain injury
that typically affects the language areas in the left hemisphere. The most common
types of aphasia are Broca's aphasia (BA) and Wernicke's aphasia (WA), which are
diagnosed based on patients' performance in verbal production and comprehension,
with reference to their brain lesions. BA is a non-fluent aphasia type, where
patients have mild or moderate difficulty understanding complex grammar and severe
impairments in speech production. BA patients (BAP) typically suffer damage in the
anterior portion of the left hemisphere, including the left inferior frontal gyrus
(IFG), known as Broca's area (Dronkers et al. 2007). WA is a type of fluent aphasia,
where patients experience poor verbal comprehension and fluently utter meaningless
speech. The damage in WA patients (WAP) is typically in the posterior portion of the
left hemisphere (Goodglass and Kaplan 1972; Yamadori 1985), including the left
posterior superior temporal gyrus (pSTG), known as Wernicke's area (Binder 2015).
Nevertheless, WAP's lesions are not necessarily limited to those regions. Recently,
a dual-stream model for auditory language processing has gained support. The dorsal
pathway from the STG to the premotor cortex via the arcuate and superior longitudinal
fascicles specifically subserves sensory-motor mapping of sound to articulation,
whereas the ventral pathway connecting the middle temporal gyrus (MTG) and
ventrolateral prefrontal cortex (vlPFC) via the extreme capsule is responsible for
auditory comprehension (e.g., Hickok and Poeppel 2007; Saur et al. 2008). Thus, BAP
and WAP with damage anywhere in these pathways should have difficulty processing
syntactically complicated sentences to a certain degree, regardless of the extent to
which their speech performance seems fluent.

The complexity of syntactic processing increases with word order changes in
languages that have flexible word orders. Japanese is assumed to have a
canonical sentence (CS) order of subject (S), object (O), and verb (V), while also
allowing a scrambled sentence (SS) order of OSV (Shibata et al. 2006). As shown in
Figure 1, relative to CS, SS has the initial O as a filler and the original position of
the O in CS as a gap. When processing a Japanese SS, native speakers cannot identify
the sentence as SS from the initial O alone because the sentence may be a
null-subject sentence (i.e., OV). They can identify it as SS only once an S, rather than
a verb phrase (VP), follows the O. During this time, they need greater working
memory to retain the O as a filler until the original gap position. This theoretical
assumption has been supported by previous psycho- or neurolinguistic studies
of Japanese sentence processing, reporting that SS induces a higher processing load
than CS (e.g., Hagiwara et al. 2007; Koizumi et al. 2014; Tamaoka et al. 2005).
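
The incremental reasoning above (an initial O alone does not reveal the word order; the order becomes identifiable only at the second case-marked NP) can be illustrated with a minimal sketch. The function name `classify_word_order` and the (word, marker) input format are hypothetical conveniences for illustration, not part of the study's materials or analysis.

```python
from typing import List, Optional, Tuple


def classify_word_order(phrases: List[Tuple[str, str]]) -> Tuple[str, Optional[int]]:
    """Classify a case-marked phrase sequence as CS (SOV) or SS (OSV).

    `phrases` is a list of (word, marker) pairs, where marker is "ga" (NOM),
    "o" (ACC), or "" for phrases without a case marker (adverbials, verbs).
    Returns the label and the index of the phrase at which the order became
    identifiable; ("undetermined", None) if it never did, as in a
    null-subject OV sentence.
    """
    first_marker = None
    for index, (_word, marker) in enumerate(phrases):
        if marker not in ("ga", "o"):
            continue  # adverbials and verbs do not disambiguate the order
        if first_marker is None:
            first_marker = marker  # the first NP must be held in memory
            continue
        if first_marker == "ga" and marker == "o":
            return "CS", index
        if first_marker == "o" and marker == "ga":
            return "SS", index
    return "undetermined", None


canonical = [("Tomoko", "ga"), ("Taro", "o"), ("home-ta", "")]
scrambled = [("Taro", "o"), ("Tomoko", "ga"), ("home-ta", "")]
null_subject = [("Taro", "o"), ("home-ta", "")]

print(classify_word_order(canonical))
print(classify_word_order(scrambled))
print(classify_word_order(null_subject))
```

In this sketch the decision point for both CS and SS is the second NP, which is also the position at which the ERPs in this chapter are time-locked.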


[Figure 1 shows two tree diagrams, "Canonical sentences (CS)" and "Scrambled sentences (SS)". In SS, the fronted object Tarô-o is the filler, and its original position is the gap (trace t).]

CS: Tomoko-ga    Tarô-o      home-ta
    Tomoko-NOM   Taro-ACC    praise-PST
    "Tomoko praised Taro."

SS: Tarô-o       Tomoko-ga   home-ta
    Taro-ACC     Tomoko-NOM  praise-PST
    "Tomoko praised Taro."


Figure 1: Syntactic structures of Japanese canonical and scrambled ordered sentences (Adapted from
Shibata et al. 2006 with permission). S = sentence; NP = noun phrase; VP = verb phrase; V = verb;
t = trace; NOM = nominative; ACC = accusative; PST = past.


The time course of the neural basis for the processing disadvantage of Japanese SS
has been attested in healthy young adult native speakers using event-related
potentials (ERP) from electroencephalography (EEG). Reportedly, SS elicited greater
positivity than CS approximately 600 ms after onset (i.e., a P600 effect) and greater
sustained left anterior negativity (Ueno and Kluender 2003; Wolff et al. 2008; Yano
and Koizumi 2018; Yasunaga et al. 2015). Yano and Koizumi (2018) report that a
P600 effect occurred at the presentation of the second noun phrase (NP), where
the gap position appears in SS. Wolff et al. (2008) find that Japanese SS elicited a
broadly distributed positivity about 200-350 ms after the case marker of the
second NP was auditorily presented.

The processing of SS is influenced by the semantic roles of NPs, in addition
to the syntactic complexity. This influence is especially prominent when the roles
of NPs are semantically reversible, such as when both NPs are animate, allowing
either to be interpreted as the agent or patient of V (Richardson, Thomas, and
Price 2010). Psycholinguistic studies show that the additional processing load of SS
relative to CS is particularly high for semantically reversible sentences compared
with non-reversible sentences (e.g., one NP is animate and the other is inanimate).
The effect of ambiguous semantic roles on processing SS has been demonstrated in
accusative (e.g., Ide, Terao, and Kiyama 2021 for Japanese) and ergative (e.g.,
Kiyama et al. 2013 for Kaqchikel Maya; Emura 2023 for Central Yupik) languages;
it is supported by the fact that the syntactic analysis of case marking patterns
requires a higher processing load when multiple NPs are semantically ambiguous.

Regarding sentence processing in patients with aphasia, Wassenaar and Hagoort
(2005) report that their native Dutch-speaking BAP showed a P600-like component
for word-category violations in sentences, but the effect was reduced and delayed
relative to their healthy counterparts. An earlier behavioral study suggests that
Japanese-speaking patients with aphasia understand CS more easily than SS
(Hagiwara and Caplan 1990). However, the extent to which native Japanese-speaking
patients with aphasia have severe problems in processing SS in semantically
reversible sentences has not yet been examined with ERP.

Functional magnetic resonance imaging (fMRI) studies of sentence processing in
Japanese and other languages provide some evidence of stronger activation in the
left IFG, including Broca's area, when processing SS (Kinno et al. 2008; Koizumi and
Kim 2016; Pallier, Devauchelle, and Dehaene 2011). A lesion study (Kinno et al. 2009)
of glioma patients supports the assumption that the ventral pathway connecting the
pSTG, MTG, and IFG in the left hemisphere is crucial to processing SS. However,
how difficulty with SS differs by aphasia type remains an open question in any
language. This chapter reports the first ERP experiment to compare how native
Japanese BAP and WAP process semantically reversible CS and SS. It aims to
contribute to the development of an effective training method of syntactic
processing for Japanese patients with aphasia, especially for those who can
understand sentences with simple structures but have difficulty understanding
sentences with complicated structures that require an accurate analysis of case
particles.

We hypothesized that BAP and WAP have special difficulty understanding
semantically reversible SS, which causes a reduced and delayed P600 effect for SS
relative to the healthy controls (HC). Regarding the difference between BAP and
WAP, we expected some differences but could not propose specific hypotheses
given the lack of previous experimental studies. Unlike most of the
above-mentioned ERP and fMRI studies, we conducted an ERP experiment on the
processing of SS in aphasia through the auditory modality because most native
Japanese-speaking patients with aphasia typically rely on spoken rather than
written communication in their daily lives. To apply neurolinguistic findings to the
efficient rehabilitation of aphasia, it is urgent to reveal how these patients process
CS and SS when they hear them rather than when they read them.


2 Methods 
2.1 Participants 


Three BAP (one woman, mean age = 54.33 years, SD = 3.09, range = 50-57 years), six
WAP (one woman, mean age = 53.33 years, SD = 9.53, range = 35-63 years), and 18
age-matched adults as HC (10 women, mean age = 55.28 years, SD = 7.26, range =
39-66 years) participated in this experiment. The participants' native language was
Japanese. All participants had normal hearing and normal or corrected-to-normal
vision without signs of hemianopia or spatial neglect. They were right-handed or
premorbidly right-handed according to a questionnaire based on the Japanese
version of the Edinburgh Handedness Inventory (Oldfield 1971). All participants
were assessed as having a typical level of fluid intelligence on the Japanese
standardized version of Raven's Colored Progressive Matrices (RCPM; maximum
score = 36; Raven 1962; Sugishita and Yamazaki 1992; see Tables 1 and 2). As shown
in Figure 2, the patients' lesions were caused by cerebrovascular accidents of
various types and were distributed across the left hemisphere. BAPs' lesions were
spread over the frontal areas, whereas WAPs' lesions generally extended from the
frontal area to the temporal area. This study was approved by
the Institutional Review Board of Tohoku University at South Kawauchi Campus,
Miyagi, Japan, and the Ethics Committee of General Rehabilitation Mihono Hospital,
Aomori, Japan. All participants provided written informed consent before the
experiment, which was conducted in accordance with the Declaration of Helsinki,
and received compensation for their participation.


Table 1: Demographic data of participants with aphasia.


                          BAP, n = 3      WAP, n = 6      HC, n = 18
Mean age (years)          54.33 (3.09)    53.33 (9.53)    55.28 (7.26)
Sex ratio (women:men)     1:2             1:5             10:8
Education (years)         13.50 (1.78)    11.50 (1.12)    14.83 (2.36)
RCPM                      34.33 (1.70)    33.00 (1.91)    32.94 (2.90)


Notes: Standard deviations are shown in parentheses. BAP = Broca's aphasia
patients; WAP = Wernicke's aphasia patients; HC = healthy controls;
RCPM = Raven's Colored Progressive Matrices (maximum score = 36).


Table 2: Clinical characteristics of participants with aphasia.


Participant  Age/Sex  Type of CVA  Lesion site                           STA  WAB.II
BAP1         57/M     ICH          L. putamen                            II   9.1
BAP2         56/M     SAH          L. frontal lobe                       II   8.75
BAP3         50/W     CI           L. insula, L. posterior frontal lobe  IV   9.45
WAP1         63/M     CI           L. temporal lobe, L. putamen          IV   9.6
WAP2         59/M     ICH          L. putamen                            IV   9.6
WAP3         58/M     CI           L. deep cerebral white matter         II   8.5
WAP4         47/W     SAH          L. temporal lobe                      II   8.1
WAP5         58/M     ICH          L. putamen                            II   9.05
WAP6         35/M     ICH          L. putamen                            II   9.6


Notes: BAP = Broca's aphasia patients; WAP = Wernicke's aphasia patients; M = man; W = woman;
CVA = cerebrovascular accident; ICH = intracerebral hemorrhage; SAH = subarachnoid hemorrhage;
CI = cerebral infarction; L = left; STA = Syntactic Processing Test of Aphasia: Revised, in which level
II refers to an understandable level of auditory comprehension of canonical sentences, and levels III
and IV refer to understandable levels for comprehending scrambled sentences. WAB.II = Auditory
verbal comprehension in Western Aphasia Battery (maximum score = 10).
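
As a check on the descriptive statistics, the patient ages listed in Table 2 can be summarized with Python's standard library. This is a minimal sketch, not the authors' analysis script; the use of the population standard deviation (`statistics.pstdev`) is an assumption, adopted because it reproduces the SDs reported in Table 1.

```python
from statistics import mean, pstdev

# Ages of the patients as listed in Table 2.
ages = {
    "BAP": [57, 56, 50],
    "WAP": [63, 59, 58, 47, 58, 35],
}


def summarize(values):
    """Return (mean, population SD), both rounded to two decimals."""
    return round(mean(values), 2), round(pstdev(values), 2)


summary = {group: summarize(values) for group, values in ages.items()}
for group, (group_mean, group_sd) in summary.items():
    print(group, group_mean, group_sd)
```

The sample standard deviation (`statistics.stdev`, which divides by n - 1) would yield larger values for groups this small, so the choice of formula matters when comparing against the reported figures.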


2.2 Stimuli 


We created 48 Japanese transitive sentences, all of which were semantically
reversible, with an animate S (agent) and an animate O (patient), as exemplified in
(1). In each stimulus sentence, we used common Japanese given names of three
morae for the NPs and a transitive V of three to five morae in the past tense. The
genders of the names always differed between S and O within a sentence. To make
the sentences sound more natural, the VP was followed by the modal auxiliary
rasi ("seemingly"), and a temporal adverb phrase (AdvP) was
inserted between S and O. Further, we prepared questions to check whether the
participants correctly comprehended the content of each stimulus sentence. Each
comprehension question concerned the content of one of the two NPs, the VP, or the
AdvP so that participants paid proper attention to every phrase of the stimulus
sentences. Given participants' reduced verbal working memory, the comprehension
questions were kept as short as possible.


[Figure 2 shows one brain image per patient, labeled BAP1, BAP2, BAP3, WAP1, WAP2, WAP3, WAP4, WAP5, and WAP6.]

Figure 2: Lesions of BAP and WAP, imaged by computed tomography or magnetic resonance imaging.
BAP = Broca's aphasia patient; WAP = Wernicke's aphasia patient; R = right; L = left.

The stimulus sentences were recorded phrase by phrase by a female native
speaker of standard Japanese who had a long career as a speech-language-hearing
therapist; the sentences were recorded at a speed that is slow for typical adult
native speakers. The comprehension questions were recorded by a male native
speaker of Japanese. The duration of each NP was trimmed to 980 ms using Praat
(version 6.0.43, Boersma and Weenink 2018) and Audacity (version 2.2.2; free,
open-source, cross-platform audio software). Moreover, the difference in mean pitch
between S and O was not significant (t = 1.092, p = 0.278, d = 0.111, 95% CI [-1.237,
4.260]). All stimulus sentences and comprehension questions are presented in
Appendices A and B.


(1) Stimulus sentence¹
    a. CS (the word order of SOV)
       Tomoko  ga   sensyû     no   nitiyôbi  Tarô  o    home    ta   rasi
       Tomoko  NOM  last week  GEN  Sunday    Taro  ACC  praise  PST  seemingly
       "It seems that Tomoko praised Taro last Sunday."


1 NOM [nominative], ACC [accusative], GEN [genitive], PST [past]. 


Chapter 11 Japanese scrambled sentences by patients with aphasia === 207 


    b. SS (the word order of OSV)
       Tarô  o    sensyû     no   nitiyôbi  Tomoko  ga   home    ta   rasi
       Taro  ACC  last week  GEN  Sunday    Tomoko  NOM  praise  PST  seemingly
       "It seems that Tomoko praised Taro last Sunday."


2.3 Procedure 


Participants sat in front of a computer screen (27" I-O DATA LCD-MQ271XDB
monitor, Kanazawa, Japan) and underwent EEG recording while they listened to
the 48 stimuli in either CS or SS order via loudspeakers (Bose Companion 2 computer
speakers, Framingham, USA). In each trial, as shown in Figure 3, they were first
visually presented with a fixation for a jittered duration between 3000 and 7000 ms,
followed by a 2-1 countdown (1000 ms for each digit) and a fixation presented for
500 ms. Next, they were auditorily presented with the stimulus sentence while the
fixation remained on the screen. Then, 1000 ms after the sentence presentation,
they were auditorily presented with the comprehension question, to which they
were subsequently asked to respond by pressing the "Yes" or "No" button (Cedrus
RB Series Response Pad RB-540, San Pedro, USA). The ERP trigger was set at the
onset of the second NP (i.e., the O in CS and the S in SS). In each NP, which lasted
980 ms on average, a case marker, -ga (NOM) or -o (ACC), which signals the
grammatical role of the NP, was presented approximately 700 ms after the onset of
the second NP. The participants underwent seven practice trials before the
experiment. During the practice trials, we adjusted the volume so that the stimuli
were clearly audible. The procedure took approximately 40 minutes, excluding the
electrode preparation. Participants took a short break every eight minutes. Python
ver. 2.7.3 was used to present stimuli and obtain behavioral data.


[Figure 3 shows the trial timeline: inter-stimulus interval, countdown, sentence presentation (female voice), question (male voice), and self-paced response. The sentence is segmented into 1st NP, AdvP, 2nd NP, VP, and rasi, with the trigger pulse at the onset of the 2nd NP.]

CS: Tomoko-ga | sensyû-no | nitiyôbi | Tarô-o | home-ta | rasi
SS: Tarô-o | sensyû-no | nitiyôbi | Tomoko-ga | home-ta | rasi

Figure 3: Experimental procedure. CS = canonical sentences; SS = scrambled sentences; NP = noun
phrase; AdvP = adverb phrase; VP = verb phrase; NOM = nominative; ACC = accusative; GEN = genitive;
PST = past.
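
The trial structure described above can be sketched as a simple event schedule. This is an illustrative reconstruction, not the presentation script used in the experiment; the function name `build_trial_timeline` and the example durations for the sentence and the question are hypothetical.

```python
import random
from typing import List, Tuple


def build_trial_timeline(sentence_ms: int, question_ms: int,
                         rng: random.Random) -> List[Tuple[str, int, int]]:
    """Return (event, onset_ms, duration_ms) tuples for one trial.

    The fixed durations follow the procedure described in the text;
    sentence_ms and question_ms depend on the audio files and are
    supplied by the caller.
    """
    events = [
        ("fixation_jitter", rng.randint(3000, 7000)),  # jittered fixation
        ("countdown_2", 1000),
        ("countdown_1", 1000),
        ("fixation", 500),
        ("sentence", sentence_ms),   # auditory stimulus sentence
        ("pause", 1000),             # gap before the question
        ("question", question_ms),   # auditory comprehension question
    ]
    timeline = []
    onset = 0
    for name, duration in events:
        timeline.append((name, onset, duration))
        onset += duration
    return timeline


# Hypothetical durations, for illustration only.
timeline = build_trial_timeline(5000, 1500, random.Random(0))
for name, onset, duration in timeline:
    print(f"{name:16s} onset={onset:6d} ms  duration={duration:5d} ms")
```

The response period is omitted from the schedule because it was self-paced.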


2.4 Electroencephalographic data acquisition, 
preprocessing, and analysis 


The EEG was recorded with NuAmps (Compumedics Neuroscan, Texas, USA)
using 29 Ag/AgCl electrodes mounted in an elastic cap (EasyCap, Munich, Germany)
according to the 10/20 system (Jasper 1958). Two additional electrodes were attached
to the upper orbital ridge and external canthus of the left eye to monitor eye
movements and blink artifacts. The online reference was set to the average of all
electrodes, and the EEGs were re-referenced offline to the average value of the
earlobes. The impedances of most electrodes were kept below 10 kΩ. Amplified
analog voltages were digitized at 1000 Hz with a system bandpass filter between 0
and 200 Hz.

The data were preprocessed offline using EEGLAB (Delorme and
Makeig 2004) in MATLAB (8.6.0.267246 [R2015b]) in the following steps. First,
the EEG data were down-sampled to 250 Hz and high-pass filtered with a cutoff of
1 Hz. Then, the power line noise was removed from the data using the CleanLine
plugin of EEGLAB. Artifact subspace reconstruction was performed to remove
high-amplitude artifacts (Mullen et al. 2015). Next, bad channels were interpolated,
and data were re-referenced to a common average reference. The adaptive
mixture independent component analysis (Palmer et al. 2007) was performed on the
continuous EEG data to eliminate remaining periodical artifacts. Epochs were
extracted from -700 to 1600 ms around the triggers. Furthermore, independent
components with less than a 70% chance of being derived from brain activity and an
estimated residual of more than 15% were removed using ICLabel. The group
analysis was conducted using the Monte Carlo permutation test (significance level
5%) with cluster correction. We analyzed the ERPs focusing on the second NP. All
trials were analyzed regardless of whether the questions were answered correctly.
The baseline was set to the 100 ms following the onset of the second NP. We
analyzed each group of BAP, WAP, and HC separately.
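
To make the epoching and baseline steps concrete, the following pure-Python sketch extracts a -700 to 1600 ms epoch at 250 Hz and subtracts a baseline. It is not the EEGLAB code used in the study; the function names are hypothetical, and the baseline window (0 to 100 ms after the onset of the second NP) is an assumption based on one reading of the baseline description in the text.

```python
from typing import List, Sequence, Tuple

SAMPLING_RATE_HZ = 250  # sampling rate after down-sampling


def ms_to_samples(ms: float, rate_hz: int = SAMPLING_RATE_HZ) -> int:
    """Convert a duration in milliseconds to a number of samples."""
    return int(round(ms * rate_hz / 1000.0))


def extract_epoch(signal: Sequence[float], trigger_index: int,
                  start_ms: float = -700, end_ms: float = 1600) -> List[float]:
    """Cut one epoch around a trigger (sample index of the 2nd NP onset)."""
    start = trigger_index + ms_to_samples(start_ms)
    stop = trigger_index + ms_to_samples(end_ms)
    if start < 0 or stop > len(signal):
        raise ValueError("epoch extends beyond the recording")
    return list(signal[start:stop])


def baseline_correct(epoch: Sequence[float], start_ms: float = -700,
                     baseline_ms: Tuple[float, float] = (0, 100)) -> List[float]:
    """Subtract the mean of the baseline window from every sample."""
    first = ms_to_samples(baseline_ms[0] - start_ms)
    last = ms_to_samples(baseline_ms[1] - start_ms)
    baseline = sum(epoch[first:last]) / (last - first)
    return [value - baseline for value in epoch]


# Synthetic single-channel signal, for illustration only.
signal = [float(i) for i in range(1000)]
epoch = extract_epoch(signal, trigger_index=500)
corrected = baseline_correct(epoch)
print(len(epoch), corrected[175])
```

At 250 Hz, the epoch spans 575 samples, and the assumed baseline window covers 25 samples starting at the trigger.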


3 Results
3.1 Behavioral data


As shown in Table 3, HC performed well in the comprehension task, with a
mean accuracy of 94.33% (SD = 5.23) for CS and 90.97% (SD = 6.59) for SS. However,
BAP and WAP performed worse than HC: the mean accuracies were 74.30%
(SD = 7.08) for CS and 69.45% (SD = 8.73) for SS in BAP, and 74.35% (SD = 20.32)
for CS and 74.65% (SD = 11.87) for SS in WAP. BAP appeared to have particular
difficulty comprehending SS, although the groups could not be compared
statistically given the unbalanced numbers of participants across groups. As the
comprehension task was intended to make the participants pay proper attention to
the stimulus sentences, the erroneous trials were not excluded from the subsequent
EEG analysis.


Table 3: Mean accuracy rate of comprehension questions (%).


        BAP, n = 3      WAP, n = 6       HC, n = 18
CS      74.30 (7.08)    74.35 (20.32)    94.33 (5.23)
SS      69.45 (8.73)    74.65 (11.87)    90.97 (6.59)


Notes: Standard deviations are shown in parentheses. CS = canonical
sentences; SS = scrambled sentences; BAP = Broca's aphasia patients;
WAP = Wernicke's aphasia patients; HC = healthy controls.


3.2 Electrophysiological data 


Figure 4 shows the ERP results for the second NP of CS and SS in HC, BAP, and WAP.
In HC (Figure 4A), a significant difference was observed in the 900-950 ms
time window after the onset of the second NP, indicating a larger positivity for SS
than CS over the bilateral frontal, temporal, parietal, and occipital regions (FC5, T7,
F3, FC1, C3, CP1, Fz, FCz, Cz, Pz, F4, FC2, C4, CP2, P4, O2, F8, FC6, T8, and CP6). In BAP
(Figure 4B), as in HC, a significant difference between CS and SS was observed
in the 900-950 ms time window after the onset of the second NP. BAP showed HC-like
positivity for SS in the bilateral frontal and parietal regions (F3, FC1, Fz, FCz, Cz, F4,
and FC2), but the number of significant electrodes was lower than that of HC. Unlike
HC and BAP, WAP showed no significant differences between CS and SS at any
electrode or in any time window (Figure 4C).
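
The logic of a Monte Carlo permutation test for a within-participant CS versus SS contrast can be sketched as follows. This is a deliberately simplified illustration for a single electrode and time window: it uses random sign flips of the paired differences and omits the cluster correction applied in the study. The function name and the data are hypothetical.

```python
import random
from typing import Sequence


def paired_permutation_p(cs: Sequence[float], ss: Sequence[float],
                         n_permutations: int = 5000, seed: int = 0) -> float:
    """Two-sided sign-flip permutation p-value for paired mean amplitudes.

    Each participant contributes one mean amplitude per condition. Under
    the null hypothesis the condition labels are exchangeable, so the sign
    of each participant's SS - CS difference is flipped at random.
    """
    rng = random.Random(seed)
    diffs = [s - c for c, s in zip(cs, ss)]
    observed = abs(sum(diffs) / len(diffs))
    extreme = 0
    for _ in range(n_permutations):
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(sum(flipped) / len(flipped)) >= observed:
            extreme += 1
    # Add-one correction so that the estimated p-value is never zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical mean amplitudes (microvolts) for eight participants.
cs_amplitudes = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ss_amplitudes = [c + 1.0 for c in cs_amplitudes]
print(paired_permutation_p(cs_amplitudes, ss_amplitudes))
```

With very small groups the number of distinct sign-flip patterns is limited (only eight for n = 3), which constrains the smallest p-value such a test can produce at a single electrode.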


[Figure 4 panels: (A) HC (n = 18), (B) BAP (n = 3), and (C) WAP, with the sub-panel titles "900-950 ms after onset of the 2nd NP" and "Mean ERPs of the 2nd NP".]

Figure 4: ERPs for hearing Japanese semantically reversible canonical sentences (CS) and scrambled
sentences (SS) by native Japanese-speaking Broca's aphasia patients (BAP) and Wernicke's aphasia
patients (WAP). In each panel, the time windows with significant differences are indicated with gray
areas. Electrodes in red showed significant differences (p < 0.05 with cluster-based permutation test)
between CS and SS. Panels (A), (B), and (C) indicate the findings of healthy controls (HC), BAP, and WAP,
respectively. S = subject; O = object; V = verb.


4 Discussion 


This study attempted to examine the neural basis for auditory comprehension of
syntactically and semantically demanding sentences in patients with aphasia. The
hypothesis that BAP and WAP would elicit reduced and delayed P600 effects for SS
relative to CS was partially supported. Only BAP yielded a positive ERP component
for SS, but the timing was not delayed relative to the HC. The positivity shown in
the HC was also considerably late for a P600, peaking around 900-950 ms after the
onset of the second NP, possibly because of the modality difference from the
previous ERP studies of Japanese sentence processing (i.e., auditory vs. visual). We
interpret the positivity as reflecting the extra processing load involved in
analyzing the complicated syntactic structure of SS. Although BAP showed an
HC-like positive ERP component, WAP did not show any significant difference
between SS and CS. As with the reduced and delayed P600 effect for a syntactic
violation found in Dutch BAP (Wassenaar and Hagoort 2005), the Japanese BAP's
reduced positive ERP component for processing SS suggests that they are less
sensitive than the HC to complex syntactic structure in semantically ambiguous
sentences. WAP, who showed no ERP effect, may be even less sensitive to this kind
of higher-level auditory syntactic processing.

The difference in ERP patterns for the semantically reversible SS between BAP
and WAP presumably stems from their different lesions. The lesions of
the BAP are within the frontal areas, while those of the WAP extend from the frontal
to the temporal areas. The finding that no expected ERP effects were found in the
WAP, whose lesions extended to the temporal areas, is consistent with the important
role of the ventral pathway connecting the MTG and vlPFC via the extreme capsule,
which has been proposed in the dual-stream model for auditory language processing
(Hickok and Poeppel 2007; Saur et al. 2008). A previous fMRI study of visual
comprehension of Japanese sentences in healthy young adults (Kim et al. 2009)
shows that the anterior portion of the dual pathways, including the IFG and the
dorsolateral prefrontal cortex in the left hemisphere, principally serves the
complicated syntactic analysis of SS. A lesion study (Kinno et al. 2009) also indicates
that regions responsible for syntactic processing through the visual modality are
restricted to the anterior part of the frontal lobe, mainly including the IFG pars
opercularis and pars triangularis. The present lesion-based study of auditory
sentence comprehension adds support for the function of the ventral pathway,
including the frontal and temporal lobes, in efficient auditory language
comprehension.

Nevertheless, this study has several limitations that should be addressed in future
studies. First, with a larger number of patients, we need to conduct an in-depth
analysis of each patient's anatomical characteristics in light of the dual pathways for
auditory language processing, together with their behavioral characteristics of
linguistic comprehension. This would allow more plausible interpretations of the
ERP effects for semantically reversible SS and CS in the auditory modality. Lesions
vary even more widely in the actual population of patients with aphasia; for
example, the lesions of Wassenaar and Hagoort's (2005) BAP affected many regions
beyond the frontal and temporal lobes, extending to the insula and the
internal capsule. Second, given that the present comparison between the SS and CS
only concerned semantically reversible sentences with two animate NPs, it
remains unclear how the obtained ERP effects for complex syntactic processing
would change if the Japanese stimulus sentences were semantically unambiguous
(e.g., if the animacy of S and O differed), requiring no strict analysis of
case markers. Third, the stimulus sentences did not sound like natural speech
because of the strict acoustic control in the experiment. Although this
study used the same sound files of each NP for presenting S and O with constant
pauses in between, pitch in naturally spoken sentences generally declines from the
beginning to the end, which helps listeners process complex sentence structures
(Wolff et al. 2008). The constant prosody across the NPs of the stimuli may have
somewhat impeded effective sentence processing among participants. Finally, the
long-term goal is to apply neurolinguistic findings to rehabilitation. We must
develop an effective method of syntactic training with which patients recover their
ability to comprehend and produce syntactically complex sentences according to the
type of aphasia. Current clinical practice for Japanese aphasia involves auditorily
presenting a single SS or CS and asking patients to select the picture that correctly
depicts the sentence. However, a speaker's use of SS is motivated by the
preceding context (e.g., Imamura 2015). An efficient way to help patients
comprehend SS is therefore to include contextual information in the stimuli,
because the processing cost of SS depends on its context, as reflected in ERP
patterns (Yano and Koizumi 2018) and behavior (Otsu 1994).


5 Conclusion 


This ERP study of the auditory processing of Japanese semantically reversible
sentences with canonical and scrambled word orders in patients with aphasia
demonstrated that native Japanese-speaking patients with BA, like their healthy
counterparts, showed a P600 effect for scrambled word order relative
to the canonical one. However, those with WA did not show any significant ERP
effects. This finding suggests that patients with BA, whose lesions are limited to
the frontal regions, analyze case particles sufficiently to comprehend complex
syntactic structures with semantic ambiguity, whereas those with WA, whose lesions
extend from the frontal to the temporal lobe, may have a functional disconnection
that impairs this processing. In line with the recent dual-stream model of auditory
language processing, the middle temporal lobe in the ventral pathway may be
crucial to the auditory processing of complex sentences in patients with aphasia.


[Appendix table of stimulus sentences. Each numbered item gives the Japanese sentence in canonical and scrambled order (first NP with -ga or -o, temporal AdvP, second NP, verb, ta, rasi), a word-by-word gloss, and an English translation of the form "It seems that X (verb, past) Y (time)." The table was typeset in rotated orientation, and its cell contents are not recoverable from this extraction.]

Notes: CS = canonical sentence; SS = scrambled sentence; NP = noun phrase; AdvP = adverb phrase;
VP = verb phrase; NOM = nominative; ACC = accusative; GEN = genitive; PST = past.


ise ej fuis eb nioulN / o oynjas ` IqoÁmjou) ou nÁsuəs 0 oyn)əs / eb  njioulN ZY 

« UJUOW 3se| Jo Buluu!6əq au oesejy puno] Oy!WINy 72y} Saas 1. 

Ajbulwaas ` Led puj WON ` ouum / OV  Oesew ` Duuufag N35 Yow se; — 25v oese / WON  Oxuny 
ise e} Par) eb oylwny / o OPSe|N unÁzoÁs ou njabuas o OPSeN / eb  oxuny 9p 


«Aepiaysak asojag Aep aui jo Bulusow ay} NIOGON 104 pa»oo| OJON Jeu} SwWaas 1], 
Aepiaysahk 

Ajbulwaas ` Led ` 01300 WON ` ou0N / 20V NJOGON Duiuou! N35 ` aotag ep ODV ` nJoqoN / WON ` OXJON 
sed e isebes eb ONJON / o  hJOQON ese ou 103010 o nJjoqoN / eb OXNJON SY 
Appendix B


Comprehension questions


a. Questions asking the subject
   Tarō ga  home   masi ta  ka?
   Taro NOM praise AUX  PST INTER
   “Did Taro praise (Tomoko)?”


b. Questions asking the object
   Tarō o   home   masi ta  ka?
   Taro ACC praise AUX  PST INTER
   “Did (Tomoko) praise Taro?”


c. Questions asking the verb
   Home   masi ta  ka?
   praise AUX  PST INTER
   “Did (Tomoko or Taro) praise?”


d. Questions asking the adverb phrase
   Sensyū   no  nitiyōbi no  koto  desu ka?
   lastweek GEN Sunday   GEN thing AUX  INTER
   “Did it happen last Sunday?”


Notes: NOM = nominative; ACC = accusative; AUX = auxiliary;
PST = past; INTER = interrogative; GEN = genitive.
Chapter 12
Experimental studies on clefts and right 
dislocations in child Japanese 


1 Introduction 


The acquisition of non-canonical word order sentences such as relative clauses and 
clefts in English and other languages has long received much attention in the literature
(de Villiers et al. 1979; Tavakolian 1981; Friedmann, Belletti, and Rizzi 2009;
Guasti, Stavrakaki, and Arosio 2012; Bever 1970; Lempert and Kinsbourne 1980;
Aravind et al. 2016; Aravind, Hackl, and Wexler 2018; among others). Reportedly,
children often have problems comprehending scrambled sentences that begin with 
objects in Japanese (Hayashibe 1975; Otsu 1994; Sano 2005). This chapter inves- 
tigates whether children have problems with other non-canonical constructions 
beginning with objects such as Japanese clefts (JCs) and Japanese right dislocations 
(JRDs). We investigate three aspects of JCs and JRDs: word order, scope interaction, 
and associations of focus particles. These examinations show that children treat 
JCs and JRDs differently in the first and second aspects while showing similar non- 
adult-like behaviors in the third aspect. Hereafter we will refer to subject clefts as 
SCs, object clefts as OCs, subject right dislocations as SRDs, and object right dislo- 
cations as ORDs, but we will only use JCs and JRDs when the comparison between 
clefts and right dislocations in Japanese is important. 

Section 2 examines Japanese children’s comprehension of JCs and JRDs. Although 
their word orders are similar and objects can appear at the beginning of sentences, 
we show that children treat JCs and JRDs differently: they have problems with JCs,
particularly with SCs, but not with JRDs. Section 3 addresses a difference in the scope
interaction between negation and the universal quantifier zenbu “all.” It is reported 
that JCs exhibit an anti-reconstruction property with negation (Mihara and Hiraiwa 
2006), but JRDs do not in adult Japanese. The results suggest that children are sen- 
sitive to the differences concerning these (anti-)reconstruction properties. Section 4 
examines children’s associations of focus particles in JCs and JRDs. Children exhibit 
similar incorrect associations of focus particles in JCs and JRDs, and we will discuss 
whether that is based on linear order or hierarchical structures based on c-command 
relations between subjects and objects. Section 5 presents the general discussion and 
conclusion. 


Open Access. © 2024 the author(s), published by De Gruyter. This work is licensed under the Creative
Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
https://doi.org/10.1515/9783110778939-012
2 Children’s comprehension of clefts and right
dislocations in Japanese 


As noted, Japanese children have problems with scrambled sentences beginning 
with objects (Hayashibe 1975; Sano 2005; see Otsu 1994 for the improvement with 
felicitous contexts). Hayashibe (1975) suggests that children may interpret the sen- 
tence-initial patient or theme as an agent (henceforth the Agent-first Strategy), as 
the canonical word order is SOV (S is subject; O, object; and V, verb) in Japanese. This
Agent-first Strategy is also reported in English (Bever 1970; Slobin and Bever 1982; 
Abbot-Smith et al. 2017). This section probes whether children’s Agent-first Strategy 
is observed in JCs and JRDs. 


(1) a. Subject Cleft (SC)
       [Neko o   oikake-teiru no wa]  inu (??*ga) da.
        cat  ACC chase-PROG   C  TOP  dog NOM     COP
       “It is a dog that is chasing the cat.”
    b. Object Cleft (OC)
       [Neko ga  oikake-teiru no wa]  inu (o) da.
        cat  NOM chase-PROG   C  TOP  dog ACC COP
       “It is a dog that the cat is chasing.”


(2) a. Subject Right Dislocation (SRD)
       Neko o   oikake-teiru yo,  inu ga.
       cat  ACC chase-PROG   SFP  dog NOM
       “(It) is chasing the cat, the dog.”
    b. Object Right Dislocation (ORD)
       Neko ga  oikake-teiru yo,  inu o.
       cat  NOM chase-PROG   SFP  dog ACC
       “The cat is chasing (it), the dog.”


In JCs (1), the bracketed presuppositional clause comes before the focused element
(Kamio 1990; Sunagawa 2005). In SC (1a), the subject is focused, and the object
comes at the beginning. Notably, the focused NP with the nominative case marker
-ga in SCs shows low acceptability (Hoji 1987; Sadakane and Koizumi 1995; Mihara
and Hiraiwa 2006; Hiraiwa and Ishihara 2012).¹ In OC (1b), the object is focused,
and the subject appears at the beginning. At the end of the presuppositional
clause of JCs, the complementizer no and the topic marker wa appear.


1 Hoji (1987), Sadakane and Koizumi (1995), Mihara and Hiraiwa (2006), and Hiraiwa and Ishihara
(2012) note that the focused NP with the nominative case marker -ga shows very low acceptability
in Japanese clefts. Regarding the focused NP with the accusative case marker -o, it seems more

In JRD (2), the part before the right-dislocated item normally ends with a sen- 
tence-final particle, such as yo, and a pause indicated by a comma. In SRD (2a), 
the subject is right-dislocated, and the nominative case marker -ga is attached. 
In ORD (2b), the object is right-dislocated and the accusative case marker -o is 
attached. The right-dislocated NPs are awkward without case markers or postpo- 
sitions. According to Kuno (1978: 68), right-dislocated elements are elided in the 
first part of the sentence because they are judged recoverable from discourse con- 
texts, and they are produced at the end of sentences for confirmation or to give 
supplementary information. Takami (1995: 160) notes that elements that can be 
right-dislocated in Japanese are items other than focus items. 

Altinok (2020) and Tomioka (2021) propose that JRDs reflect the following strat- 
egy: “communicate the essential part of the informational content of the utterance 
as early as possible” (Tomioka 2021). As per their proposal, the essential part of 
the informational content of the utterance comes first in JRDs. We later discuss 
Altinok’s (2020) and Tomioka’s (2021) analyses in depth. Here, it suffices to say that
the information structures of JCs and JRDs seem to be somewhat opposite: In JCs, a 
presuppositional clause appears first, and the focus comes after, whereas in JRDs, 
the essential part of the informational content of an utterance comes first, and the 
non-focus item appears at the end. 

Although the information structures of JCs and JRDs seem to be different, 
their word orders are quite similar, as shown in (1) and (2): OVS and SVO. Given 
that the object comes at the beginning in SCs and SRDs, children may use the 
Agent-first Strategy and misinterpret the sentence-initial object as an agent. 


acceptable than that with the nominative case marker: Hoji (1987) used one question mark (?) and 
Sadakane and Koizumi (1995) used two question marks (??) for their judgements. Hiraiwa and 
Ishihara (2012) noted that the focused NP with the accusative case marker is accepted by some 
speakers, including the authors. 

According to Hoji (1987, 1990) and Hiraiwa and Ishihara (2002, 2012), there are two types of 
Japanese clefts: case-marked and non-case-marked clefts. The most important difference between 
the two is the island sensitivity: the former is sensitive to islands, but not the latter. That is, the 
occurrence of movements is assumed in case-marked but not non-case-marked clefts. However, as 
noted, given that the clefts with the focused NPs with nominative case marker -ga are awkward, we 
dropped the nominative case marker -ga in SCs (i.e. subject clefts) and included the accusative case 
marker -o in OCs (i.e. object clefts) in our experiments in Sections 2 and 3. The reason for including
the accusative case marker -o in the experiments is that we assume the occurrence of movements 
in JCs and address reconstruction effects. 
Children’s performance on SCs and SRDs may then be much worse than that on
OCs and ORDs. In this section, we examine children’s comprehension of SCs, OCs,
SRDs, and ORDs, in addition to scrambling.


2.1 Previous studies 
2.1.1 Acquisition of clefts in English 


Children acquiring English have problems with OCs around the age of 4 or 5 (Bever 
1970; Lempert and Kinsbourne 1980). In English, in an SC, such as “It is a dog that 
is chasing the cat,” the word order is SVO. In an OC, such as “It is a cat that the dog 
is chasing,” the word order is OSV. English-speaking children may have problems 
with OCs because the object comes before the subject. 

Aravind et al. (2016) and Aravind, Hackl, and Wexler (2018) show that children’s
performance becomes much better when felicitous contexts are given. They
used the truth value judgment task with illustrations (Crain and Thornton 1998).
A context was given with the first picture, in which the focus was hidden by a black box,
and the test sentence was given after the black box disappeared.


(3) a. Context: Look! The dog is chasing something, I wonder what it is. 
b. Test sentence: It is a cat that the dog is chasing. 


Children’s responses in Aravind et al. (2016) were as follows: Matched SCs: 84%,
Matched OCs: 83%, Mismatched SCs: 82%, Mismatched OCs: 34%. Thus, when contexts
were given as in (3), children’s performance on matched OCs was much better than
in previous studies.

Our research group, however, notes that children’s better performance for
matched OCs may stem from an experimental artifact (Ohba, Sano, and Yamakoshi
2019). In the matched condition in (3), only two animals appeared in the pictures,
and the focused animal was covered with a box. Children may respond correctly
only by hearing the first part of the cleft sentence, “It is a cat.” Therefore, we conducted
a revised experiment with three-animal conditions, as reported in 2.2.²


2 Some studies examined the acquisition of Japanese clefts, but their results vary per experimental 
method [e.g., the act-out task in K. Sano (1977) and the picture-selection task in Dansako and Mizu- 
moto (2007)]. Thus, we do not go into the details of such results here. 
2.1.2 Acquisition of right dislocation in Japanese


Although no study, to the best of our knowledge, has conducted experiments on
JRDs, some studies examine JRDs in children’s naturalistic speech. Sugisaki (2005)
examines whether wh-phrases appear in right-dislocated positions in children’s
natural speech data. Wh-phrases cannot be right-dislocated in Japanese because
the right-dislocated items are not focused (Takami 1995; Tanaka 2001).³ Sugisaki
(2005) examines the utterances of two children (Aki: 2;6.15⁴–3;0.0, Ryo: 2;4.25–3;0.30)
in the Miyata corpus (Miyata 2004a, b) in the CHILDES database (MacWhinney
2000). Table 1 shows the results.


Table 1: Children’s utterances of subject-object-verb (S)OV and (S)VO sentences
(Sugisaki 2005: 588).


                                          Aki                          Ryo
                                  (S)OV        (S)VO          (S)OV        (S)VO
                                  (canonical)  (i.e. ORD)     (canonical)  (i.e. ORD)
Total # of utterances             518          38             252          43
# of direct object wh-questions   185          0              40           0
% of wh-questions                 35.7%        0%             15.9%        0%


Aki produced 38 ORDs and Ryo produced 43 ORDs. Although they produced many 
object wh-questions in (S)OV order, they did not produce any ORDs with direct object 
wh-phrases. This contrast shows that they are sensitive to the constraint on JRDs. 

Dansako (2018) examined SRDs in natural speech data of four children before 
the age of three in the CHILDES database and showed that children produced SRDs 
at the age of two and did not misuse the nominative case marker. 

In summary, according to Sugisaki (2005) and Dansako (2018), children produce 
JRDs at the age of two, and they are sensitive to the constraint on JRDs. However, 
to the best of our knowledge, no prior studies have examined whether Japanese 
children misinterpret SRDs as they do in scrambling by using the Agent-first Strat- 
egy. Therefore, in our experiment, we examine children's interpretations of JRDs, 
especially SRDs, relative to JCs and scrambling. 


3 For more detailed explanations of why wh-phrases cannot occur in right-dislocated positions, see 
Tanaka (2001) and Sugisaki (2005). Further, Yamashita (2010) and Miyata (2018) noted that there 
is a prosodic factor that induces the prohibition of wh-phrases in right-dislocated positions; see 
Yamashita (2010) and Miyata (2018) for details. 

4 The numbers represent the children’s ages in the format years;months.days.
2.2 Experiment
Let us first present our research question and predictions for this experiment:⁵


(4) Research question and predictions 
Do Japanese children apply the Agent-first Strategy to sentences that have 
sentence-initial objects? If so, children are expected to misinterpret not 
only scrambling but also SCs and SRDs, whereas they are not expected to 
misinterpret OCs and ORDs. 
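
The prediction in (4) can be spelled out mechanically. The following is only an illustrative sketch, not part of the experimental materials: the dictionary of surface orders is taken from examples (1), (2), and (5), while the function name `agent_first_misparse` and the string encoding of word orders are assumptions introduced here for illustration.

```python
# Hypothetical encoding of the surface orders of subject (S), object (O),
# and verb (V) in each construction, following examples (1), (2), and (5).
CONSTRUCTIONS = {
    "canonical": "SOV",
    "scrambling": "OSV",
    "SC": "OVS",
    "OC": "SVO",
    "SRD": "OVS",
    "ORD": "SVO",
}


def agent_first_misparse(order: str) -> bool:
    """Return True if treating the first NP as the agent assigns the wrong
    role, i.e. if the first argument in the surface order is the object."""
    first_argument = next(ch for ch in order if ch in "SO")
    return first_argument == "O"


# Constructions that the Agent-first Strategy predicts to be misinterpreted.
predicted_errors = sorted(
    name for name, order in CONSTRUCTIONS.items() if agent_first_misparse(order)
)
print(predicted_errors)  # ['SC', 'SRD', 'scrambling']
```

Under this simplified encoding, the strategy picks out exactly scrambling, SCs, and SRDs, which is the set of constructions that (4) expects children to misinterpret.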


The subjects were 18 Japanese monolingual children (4;7-6;7, mean: 5;6) and 11 
adults. We divided the children into two groups to test JCs and JRDs: the JC group 
(N = 9, 4;7-6;6, mean: 5;5) and the JRD group (N = 9, 4;8-6;7, mean: 5;7). 

Although we followed the truth value judgment task used by Aravind et al.
(2016) and Aravind, Hackl, and Wexler (2018), the scenarios involved three animals
and two black boxes to avoid the experimental artifact noted above. As shown below, the
context was given orally by the experimenter with the first picture on the computer
screen (Figure 1). In the first picture, two animals were hidden by black boxes. When
the second picture was shown, an anime character, Anpanman, appeared beside
it. The recorded test sentence was given as Anpanman’s description
of the second picture. The child was asked to judge whether the recorded test
sentence was true or false based on the second picture.

Below are examples of SC, OC, SRD, ORD, and scrambling. In SC (5a), we did not 
include the nominative case marker with the focused subject given that its accepta- 
bility is low in adult speech (see also footnote 1). In OC (5b), the focused object was 
given with the accusative case marker. In SRD (5c) and ORD (5d), the right-dislo- 
cated NPs were given with the case markers. 


(5) (Contexts)

    Dareka  ga  zousan   o   aratte-ite,
    someone NOM elephant ACC wash-PROG
    zousan   ga  dareka  o   aratte-iru yo.
    elephant NOM someone ACC wash-PROG  SFP

5 Some parts of this experiment were originally reported in Shimada et al. (2020). 
(Sample test sentences)

a. SC
   Zousan   o   aratte-iru no wa  usisan da.
   elephant ACC wash-PROG  C  TOP cow    COP
   “It is a cow that is washing the elephant.”

b. OC
   Zousan   ga  aratte-iru no wa  usisan o   da.
   elephant NOM wash-PROG  C  TOP cow    ACC COP
   “It is a cow that the elephant is washing.”

c. SRD
   Zousan   o   aratte-iru yo,  usisan ga.
   elephant ACC wash-PROG  SFP  cow    NOM
   “(It) is washing the elephant, the cow.”

d. ORD
   Zousan   ga  aratte-iru yo,  usisan o.
   elephant NOM wash-PROG  SFP  cow    ACC
   “The elephant is washing (it), the cow.”

e. Scrambling
   Zousan   o   usisan ga  aratte-iru yo.
   elephant ACC cow    NOM wash-PROG  SFP
   “The elephant, a cow is washing.”


Figure 1: Pictures presented with the test sentences in (5). 


We hid two animals because they are candidates for the focus. For example, in (5a), 
children first hear the accusative case-marked NP, zousan-o “elephant-ACC,” and 
they may think that the elephant is a patient if they are aware of the accusative 
case marker. At the end of the sentence, they hear the focused NP, usisan “cow.” This 
second NP is crucial for children to judge the test sentence. By listening to the end of 
the sentence, children must notice that the animal washing the elephant is not the 
cow but the dog. Therefore, by using three animals and two black boxes, we tried to 
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exclude the possibility that children produce correct answers only by hearing the 
first case-marked NP. There were four trials (two true and false conditions) for SCs, 
OCs, SRDs, ORDs, and scrambling. We used two verbs throughout this experiment, 
arau “wash” and oikakeru “chase.” Moreover, four declarative test sentences for 
practice and four scrambled test sentences were included. 


2.3 Results and discussion 
Table 2 shows the results of the experiment: 


Table 2: Correct Response Rates of Scrambling, Clefts, and Right Dislocations.


                Scrambling   Subject Clefts   Object Clefts   Subject RD   Object RD
                             (SCs)            (OCs)           (SRDs)       (ORDs)
Word Order      OSV          OVS              SVO             OVS          SVO
Children        56.9%        52.8%            91.7%           86.1%        100%
(JC: N = 9,     (41/72)      (19/36)          (33/36)         (31/36)      (36/36)
JRD: N = 9)
Adults          100%         100%             100%            100%         100%
(N = 11)        (44/44)      (44/44)          (44/44)         (44/44)      (44/44)


First, the children’s correct response rate for scrambling was 56.9%, which is 
around the chance level. As we mentioned at the beginning of Section 2, Otsu (1994) 
used the act-out task and showed that children aged 3 and 4 responded almost perfectly
to scrambling when an appropriate discourse context showing the topic was
given. In this experiment, however, given that two of three animals were covered 
with black boxes, the contexts we provided did not seem to help children compre- 
hend scrambled test sentences. 
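
As a rough check on what "around the chance level" means for the counts in Table 2, the percentages can be recomputed and compared with a chance rate of 50%. The sketch below is an illustration added here, not an analysis reported for the experiment: it uses an exact two-sided binomial test and assumes, unrealistically, that all trials are independent, although each child contributed four trials per condition. The function name `binom_two_sided_p` is hypothetical.

```python
from math import comb

# Correct/total response counts for the children, copied from Table 2.
COUNTS = {
    "Scrambling": (41, 72),
    "SC": (19, 36),
    "OC": (33, 36),
    "SRD": (31, 36),
    "ORD": (36, 36),
}


def binom_two_sided_p(k: int, n: int) -> float:
    """Exact two-sided binomial p-value against chance (p = 0.5): the total
    probability of all outcomes that are no more likely than k successes."""
    observed = comb(n, k)
    extreme = sum(comb(n, i) for i in range(n + 1) if comb(n, i) <= observed)
    return extreme / 2 ** n


for name, (k, n) in COUNTS.items():
    rate = 100 * k / n
    print(f"{name}: {rate:.1f}% correct, p = {binom_two_sided_p(k, n):.4f}")
```

Under this simplifying assumption, the rates for scrambling and SCs are not distinguishable from guessing, whereas those for OCs, SRDs, and ORDs are far from it. A mixed-effects analysis with children as a random factor would be the more appropriate test.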

Next, let us focus on the performance for SCs and OCs. The children’s correct 
response rate for SCs was 52.8%, and that for OCs was 91.7%. Clearly, children have 
difficulty comprehending SCs, where objects appear in the sentence-initial posi- 
tion. The OC performance was much better, probably because subjects, not objects, 
appeared at the beginning of OCs. When we consider the results of scrambling 
and SCs, we suggest that children are using the Agent-first Strategy to comprehend 
scrambling and SCs.⁶


6 We suggested this possibility in Ohba, Sano, and Yamakoshi (2019). Intriguingly, Sano (2020) test- 
ed children’s comprehension of JCs with the first NP showing location with the particle -ni, such as 
Now we focus on the results for SRDs and ORDs. The correct response rate for
SRDs is 86.1%, and that for ORDs is 100%. Surprisingly, although the objects appeared at the
beginning of SRDs, children’s performance for SRDs was much
better than that for scrambling and SCs.⁷ These results suggest that children do not
always use the Agent-first Strategy even when objects appear in a sentence-initial
position.⁸ Alternatively, the Agent-first Strategy may still be in effect in JRDs, but
another factor such as information structure, which we introduce below, may override
the effect of the Agent-first Strategy.

Why are children good at JRDs but not SCs or scrambling? Altinok’s (2020) and 
Tomioka’s (2021) analyses of the information structure of JRDs may provide some 
insight. According to their analysis, JRDs have the following strategy: “communi- 
cate the essential part of the informational content of the utterance as early as pos- 
sible” (Tomioka 2021). They propose the following structure for JRDs: the dislocated 
element moves to the specifier of the Discourse Phrase, and the rest of the sentence
undergoes remnant movement from the complement of Disc⁰ and adjoins
above the dislocated element in Spec, Discourse P. Let us call this approach the
Discourse Phrase Approach. 


Buta-ni not-teiru no wa dare kana? (Pig-LOC ride-PROG C TOP who Q) “Who is it that is riding on the 
pig?” As the children’s correct response rates were above 90%, he concluded that children do not 
use the Agent-first Strategy when the particle attached to the first NP is -ni. Therefore, the further 
issue is to determine when children (do not) use the Agent-first Strategy. 

7 One of the reviewers noted that the presence of a case marker in JRDs and its absence in JCs might
have affected children’s performance. In (5a), we did not attach the nominative case marker to
the subject in the focus position, as it is said to be unacceptable to many speakers, as noted in footnote
1. However, in (5c), the nominative case marker -ga is attached to the right-dislocated subject usisan
“cow.” (5c) becomes awkward when the right-dislocated item lacks the nominative case marker
-ga. Generally, right-dislocated NPs do not sound natural if case markers or postpositions are not
attached. The presence or absence of case markers may be one of the factors affecting children’s
performance with JCs and JRDs. We would like to investigate this issue further in future research.
8 One of the reviewers suggested that the difference between the results of SCs and SRDs in our 
experiment might indicate that the hypothesis of the Agent-first Strategy is not correct. Howev- 
er, effects of the Agent-first Strategy have long been reported in other constructions in Japanese, 
Korean, and Chinese such as scrambling (Hayashibe 1975), relative clauses (Suzuki 2011) and pas- 
sives (Huang et al. 2013 in Chinese; Deen et al. 2018 in Korean). Thus, we assume the presence 
of the Agent-first Strategy and its effect on children's comprehension of SCs in Japanese. 
(6) [Tree diagram: the structure of JRDs under the Discourse Phrase Approach]


(Altinok 2020: 10; Tomioka 2021)


According to the proposed strategy of JRD (i.e., “Communicate the essential part of 
the informational content of the utterance as early as possible”) (Tomioka 2021), 
the essential part of the informational content of the utterance comes first in JRDs. 
Accordingly, JRDs may be much easier for children to comprehend than SCs and 
scrambled sentences. 

Altinok (2020) and Tomioka (2021) also note that a predicate is a core ingredient 
of JRDs. If their analysis is right, predicates, including subjects and objects, come at 
the beginning of JRDs, which may make them easier for children to comprehend. 
However, in SCs, the presuppositional clause at the beginning may not be
the core ingredient of the sentence, and because the object appears at the beginning,
it may be more challenging for children to comprehend SCs than SRDs. Regarding
scrambling, the predicate (i.e., the verb) comes at the end of the sentence in Japanese;
thus, this might be one of the causes of children’s difficulties with scrambling.⁹

When we consider “the essential part of the informational content of the utterance”
(Tomioka 2021), children’s processing of SCs and SRDs may be different. Some
studies examine Japanese adults’ processing of JCs and JRDs (Kahraman et al. 2011;
Yano, Tateyama, and Sakamoto 2015; Soshi and Hagiwara 2004),¹⁰ but to the best of
our knowledge, no previous studies examine children’s processing of JCs and JRDs.
Future research must investigate Japanese children’s processing of JCs and JRDs to
explore the causes of the difference between SCs and SRDs.


9 Sugiura (2022) examines children’s comprehension of multiple JRDs, with VSO and VOS word
orders. Sugiura finds that children were good at VSO and VOS, showing that they are good at JRDs
in general. It is not clear how the Discourse Phrase Approach would account for children’s good
interpretation of the SO and OS orders in VSO and VOS. See Sugisaki et al. (2014) for the acquisition
of VOS/VSO orders in Kaqchikel.

10 Kahraman et al. (2011) examine the processing of SCs and OCs with adults using a self-paced
reading task and show that adults used more reading time for SCs than OCs. These results coincide
with our children’s results. However, Yano, Tateyama, and Sakamoto (2015) present different results
in their ERP study. According to them, P600 amplitudes with OCs were larger than those with
SCs, indicating that SCs are easier to process than OCs. This seems to contradict Kahraman et
al.’s (2011) results. Soshi and Hagiwara (2004) conduct an ERP study of JRDs with adults and show


3 Children’s comprehension of scope interactions 
in Japanese clefts and Japanese right 
dislocations 


3.1 The (anti-)reconstruction properties of Japanese clefts 
and Japanese right dislocations 


In this section, we focus on another aspect in which children show different behav- 
iors between JCs and JRDs. Let us consider the scope interactions in JCs and JRDs. 
Notably, JCs exhibit a reconstruction property in sentences without negation, but 
they exhibit an anti-reconstruction property with negation (Hoji 1987; Mihara and 
Hiraiwa 2006; Nishigauchi and Fujii 2006; a.o.): 


(7) a. Taro ga  A seme-ta    no wa  zibunzisin o   da.
       Taro NOM   blame-PAST C  TOP himself    ACC COP
       “It was himself that Taro blamed.” (Mihara and Hiraiwa 2006: 265)
    b. Hutari no  Syoonen ga  A utaw-ta   no wa
       2-CL   GEN boy     NOM   sing-PAST C  TOP
       samba o   3-kyoku da.
       samba ACC 3-CL    COP
       “It was three sambas that (the) two boys sang.”
       (ok 3>2 = collective, ok 2>3 = distributive) (Nishigauchi and Fujii 2006: 10)


that the positivity effect of argument JRDs is P345, which is different from the study of JCs by Yano, 
Tateyama, and Sakamoto (2015). Soshi and Hagiwara (2004) also note that the positivity effect of 
argument JRDs was observed in the left frontal and temporal areas, which was not found with 
adjunct JRDs. They argue that this positivity effect is a syntactic integration process of dislocated 
arguments. Although the studies were conducted with adults, they may give us hints to understand 
children's comprehension of JCs and JRDs. 


234 —— Kyoko Yamakoshi and Hiroyuki Shimada 


(8) a. *Taro ga  A tabe-nakat-ta no wa  ringo-sika da.
        Taro NOM   eat-NEG-PAST  C  TOP apple-FOC  COP
        lit. “It was only an apple that Taro ate.”
    b.  Taro ga  A tabe-nakatta  no wa  ringo zenbu da.
        Taro NOM   eat-NEG-PAST  C  TOP apple all   COP
        “It was all the apples that Taro didn’t eat.” (ok all>neg, *neg>all)


In (7) are examples of JCs without negation. In (7a), zibunzisin “self” in the focus posi- 
tion can be bound and coindexed by Taro, the subject in the presuppositional clause. 
According to Mihara and Hiraiwa (2006), zibunzisin seems to be reconstructed in the 
object position, shown by A, in the presuppositional clause at LF. In (7b), Nishigauchi 
and Fujii (2006) suggest that the focused quantified object is reconstructed into the 
position of A in the presuppositional clause and is within the c-command domain at 
LF. The operation of reconstruction yields the distributive reading. 

There are various syntactic analyses of JCs, such as the null operator move- 
ment analysis by Matsuda (1997), Hoji and Ueyama (1998), and Kizu (2005), the 
string-vacuous verb movement, remnant movement and the operator movement 
analysis by Koizumi (1995), and the direct movement analysis by Hiraiwa and Ishi- 
hara (2012). Here we do not choose a particular analysis, but we assume a focused 
element is moved from its original position to the focused position in JCs. Thus, the 
reconstruction effect can be seen in (7). 

In (8) are examples of JCs with negation. In (8a), ringo-sika “nothing but the
apple” cannot appear in the focus position. XP-sika is a Negative Polarity Item
(NPI), and it must be c-commanded by negation internally (Kishimoto 2018: 7), 
but the ungrammaticality of (8a) shows that ringo-sika cannot be reconstructed 
in the canonical object position under negation in the presuppositional clause, 
which is called the anti-reconstruction property. Mihara and Hiraiwa (2006: 265) 
also present similar examples to show the anti-reconstruction property of JCs. 
In (8b), ringo zenbu “apple all” can appear in the focus position, but it cannot 
have a scope under negation, which also shows that it cannot be reconstructed 
under negation in the presuppositional clause. Thus, in (8b), the all>neg interpretation,
but not the neg>all interpretation, is acceptable. (8a) and (8b) show the anti-reconstruction
properties of JCs. 

In contrast to JCs, as JRD (9a) shows, the NPI LGB-sika “nothing but LGB” can be
right-dislocated in a negative sentence. It shows that LGB-sika is reconstructed in 
the canonical position, as shown by A, and LGB-sika is c-commanded by negation 
in the position of A. 
(9) a. Taro ga  A yom-anak-atta yo,  LGB-sika.
       Taro NOM   read-NEG-PAST SFP  LGB-FOC
       “(lit.) Taro read only LGB.” (Takita 2011: 382)
    b. Taro ga  A tabe-nakatta  yo,  ringo zenbu o.
       Taro NOM   eat-NEG-PAST  SFP  apple all   ACC
       “Taro didn’t eat, all the apples.” (ok all>neg, ok neg>all)


In (9b), both all>neg and neg>all interpretations are possible, which means that
ringo zenbu “apple all” can be reconstructed in the canonical position, and the
negation can take wide scope over ringo zenbu, unlike in the cleft sentence in (8b).
Therefore, (9a) and (9b) show that JRDs allow for the reconstruction of
right-dislocated elements. Let us call this the reconstruction property of JRDs.

Several syntactic approaches have been proposed for JRDs: for example,
the rightward movement approach by Haraguchi (1973), leftward movement
approaches, such as the double preposing analysis by Kurogi (2006), and the
bi-clausal repetition and deletion analysis by Kuno (1978), Whitman (2000), Abe
(2004), Takita (2011), and Yamashita (2011). Here we do not focus on a particular
analysis; we simply assume that the dislocated element moves from its original
position to the dislocated position. Thus, the reconstruction effects are
observed in (9).

In summary, although JCs and JRDs have similar non-canonical word orders, 
there is a difference in their reconstruction properties: in negative sentences, JCs 
have an anti-reconstruction property, and JRDs have a reconstruction property. The 
following table summarizes such properties. To our knowledge, no previous studies 
have examined children's knowledge of these properties. Therefore, in this section, 
we examine whether Japanese children are sensitive to the (anti-)reconstruction 
properties of JCs and JRDs. In the experiment, we focus on the cases with negation, 
as in Table 3. 


Table 3: Availability of reconstruction in JCs and JRDs.

              Reconstruction in JCs    Reconstruction in JRDs
without neg   ✓                        ✓
with neg      ✗                        ✓


236 —— Kyoko Yamakoshi and Hiroyuki Shimada 


3.2 Experiment 
The following are the research question and predictions for this experiment:¹¹


(10) Research question and predictions 
Do Japanese children know of the anti-reconstruction property of JCs and the 
reconstruction property of JRDs? If so, Japanese children should reject neg>all 
readings in JCs but accept neg>all readings in JRDs when they interpret 
the scope interaction between negation and the universal quantifier. 


We tested 20 Japanese monolingual children (4;8–6;6) and 23 adult native speakers
of Japanese using the truth value judgment task (Crain and Thornton 1998).
We divided the subjects into two groups. In the JC group, there were 10 children
(4;11–6;6, mean: 5;8) and 12 adults, and we tested whether the children would
reject neg>all readings and whether they would accept all>neg readings for JCs.
In the JRD group, there were 10 children (4;8–6;6, mean: 5;9) and 11 adults,
and we tested whether they would accept neg>all and all>neg readings for JRDs.

The test sentences in the main session were as follows.¹² In the JC (JRD) group,
we used two JCs (JRDs) with neg>all contexts and two JCs (JRDs) with all>neg
contexts. The test sentences were recorded and given after short stories were
presented with pictures. Beside the last picture, an anime character, Anpanman,
appeared, and the test sentences were given as an explanation of the story uttered
by Anpanman. The children were asked to judge whether the test sentence that
Anpanman gave was true or false.

The following is an example of the contexts for neg>all readings and the test 
sentences. The same stories were used for JC and JRD groups. (11) and (12) give the 
story, the last picture of the story, and the test sentences in the JC and JRD groups 
(see Figure 2). 


(11) (Context for the neg>all reading) There are a mouse and a dog. The teacher
told them to have sweets and vegetables. The mouse should take a tomato
and a piece of cake, and the dog should take a pudding and three green
peppers. The mouse took a piece of cake but left the tomato because it did not
like tomatoes. The dog took one pudding. The dog should have taken all the
green peppers, but it did not want to take them because it did not like green
peppers. However, the dog remembered what the teacher said. Therefore, the
dog took two of the three green peppers but left one.


11 This experiment was originally reported in Okada et al. (2019).
12 In each group, the practice session involved two canonical sentences and two JCs or
two JRDs with negation, and one JC or JRD with zenbu “all,” to examine whether the
children knew the meanings of negation and zenbu. All the subjects passed the practice
session.


Figure 2: Last picture of the story (11). 


(12) Test sentences in JC group and JRD group

     a. JC (all>neg, *neg>all)
        Inu-san ga   A tora-nakat-ta  no wa
        dog     NOM    take-NEG-PAST  C  TOP
        piiman       zenbu o   da  yo.
        green pepper all   ACC COP SFP
        “It is all the green peppers that the dog didn't take.”

     b. JRD (OK all>neg, OK neg>all)
        Inu-san ga   A tora-nakatta   yo,  piiman       zenbu o.
        dog     NOM    take-NEG-PAST  SFP  green pepper all   ACC
        “The dog didn’t take, all the green peppers.”


Given that (12a) is a JC, the focus piiman zenbu-o “green pepper all-ACC” cannot be
reconstructed in the presuppositional clause, and the negation cannot take wide
scope over the quantifier zenbu “all.” Hence only the all>neg reading is allowed. We
expected that children would reject (12a) if they only allowed the all>neg reading
because the dog left one green pepper at the end of the story.

(12b) is a JRD, and the right-dislocated object piiman zenbu-o “green pepper
all-ACC” can be reconstructed to the canonical position A, which means that both
all>neg and neg>all readings are allowed. We expected that children would accept
(12b) if they allowed the neg>all reading because the dog took two green peppers
but not all three.

Concerning the contexts for all>neg readings, for example, one animal took
none of the vegetables. In JCs, the focused zenbu “all” cannot be reconstructed
under negation. Thus, children should have accepted all>neg readings if they knew
the anti-reconstruction property of JCs. However, in JRDs, children should have
accepted or rejected all>neg readings if they were aware of the reconstruction
property of JRDs.

3.3 Results and discussion 


Tables 4 and 5 show the results of the experiment. 


Table 4: Acceptance rates of the Japanese clefts group.

            Neg>all reading (*)    All>neg reading (OK)
Children    10.0%                  90.0%
(N=10)      (2/20)                 (18/20)
Adults      0.0%                   100.0%
(N=12)      (0/24)                 (24/24)


Table 5: Acceptance rates of the Japanese right dislocations group.

            Neg>all reading (OK)   All>neg reading (OK)
Children    60.0%                  100%
(N=10)      (12/20)                (20/20)
Adults      54.5%                  95.5%
(N=11)      (12/22)                (21/22)


Table 4 shows the acceptance rates of the JC group. Children’s acceptance rate of
all>neg readings was 90.0%, whereas that of neg>all readings was 10.0%. Thus,
most of the children correctly assigned all>neg readings to JCs. Adults’ acceptance
rate of all>neg readings was 100% and that of neg>all readings was 0%. Hence,
adults rejected neg>all readings for all JCs, and children and adults comprehended
JCs quite similarly. These results suggest that children are aware of the
anti-reconstruction property of JCs.¹³

Table 5 shows the acceptance rates of the JRD group. Children’s acceptance rate
of all>neg readings was 100%, and that of neg>all readings was 60.0%. These results
show that children accepted both readings of JRDs in more than half of the trials,
depending on the context. Adults’ acceptance rate of all>neg readings was 95.5%, and
that of neg>all readings was 54.5%.¹⁴ The results show that children and adults
behaved quite similarly, which suggests that children know the reconstruction
property of JRDs.


13 We presented a similar observation concerning scope assignment in JCs that include
focused elements without case markers in Shimada et al. (2019).

As for the children's differences between JCs and JRDs, the children mostly
rejected neg>all readings in the JC group (10% acceptance, which means 90%
rejection), but they accepted neg>all readings 60% of the time in the JRD group. This
difference suggests that children are aware of the difference between JCs and JRDs
concerning the scope interaction between negation and the universal quantifier
zenbu “all.”
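
The percentages in Tables 4 and 5 can be recomputed from the raw counts. The following is a minimal Python sketch, not part of the original study: the dictionary layout and the names COUNTS, acceptance_rate, and child_gap are our own illustrative choices, and only the counts themselves come from the tables.

```python
# Accepted responses / total responses, copied from Tables 4 and 5.
# Keys are (construction, group, reading); the layout is an illustrative assumption.
COUNTS = {
    ("JC", "children", "neg>all"): (2, 20),
    ("JC", "children", "all>neg"): (18, 20),
    ("JC", "adults", "neg>all"): (0, 24),
    ("JC", "adults", "all>neg"): (24, 24),
    ("JRD", "children", "neg>all"): (12, 20),
    ("JRD", "children", "all>neg"): (20, 20),
    ("JRD", "adults", "neg>all"): (12, 22),
    ("JRD", "adults", "all>neg"): (21, 22),
}


def acceptance_rate(accepted, total):
    """Return the acceptance rate as a percentage rounded to one decimal place."""
    return round(100 * accepted / total, 1)


rates = {key: acceptance_rate(*value) for key, value in COUNTS.items()}

# Difference in children's neg>all acceptance between JRDs and JCs.
child_gap = rates[("JRD", "children", "neg>all")] - rates[("JC", "children", "neg>all")]

for (construction, group, reading), rate in rates.items():
    print(f"{construction:3} {group:8} {reading}: {rate}%")
print(f"children, neg>all, JRD minus JC: {child_gap} percentage points")
```

The sketch only reproduces the descriptive rates; it does not perform any statistical test, and none is reported in this section.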

Furthermore, the research group tested whether children are sensitive to the
reconstruction property of JCs when they are without negation by using test
sentences similar to (7b) in another experiment (see Shimada et al. 2019). We tested 14
children (4;3–6;6, mean: 5;5). Given that the children allowed for the distributive
readings 85.7% (24/28) of the time, the results show that children are sensitive to
the reconstruction property of JCs without negation and the anti-reconstruction
property of JCs with negation.

In summary, although the word orders of JCs and JRDs are similar, the exper- 
iment shows that children differentiate JCs and JRDs when they interpret scope 
interactions. These results suggest that children know the anti-reconstruction 
property of JCs with negation, the reconstruction property of JCs without negation 
(Shimada et al. 2019), and the reconstruction property of JRDs regardless of the 
presence of negation. 

In Sections 2 and 3, we have shown that children treated JCs and JRDs differ- 
ently. In the next section, we will show that children make incorrect associations of 
focus particles in JCs and JRDs. 


14 Reviewers note children’s and adults’ low acceptance rates for neg>all readings. In the
test sentences of our experiment, we used the accusative case marker -o with zenbu “all.”
When the accusative case marker is attached to the quantifier zenbu “all” in negative
sentences, both all>neg and neg>all readings should be available, but there seems to be a
preference for all>neg readings with the accusative case marker -o. This preference may
stem from the presence of the contrastive topic marker -wa in Japanese. When the
contrastive marker -wa is attached to zenbu “all” instead of the accusative case marker -o,
only neg>all readings are acceptable (Kato 1985, McGloin 1987, a.o.). In Goro (2007),
similar experimental results were given with test sentences including phrases such as
omocha-o zenbu “toy-ACC all,” and children's acceptance rate of neg>all readings was 42.5%
(p. 317). As Goro suggested, given the presence of the contrastive marker -wa, which yields
only neg>all readings in negative sentences with zenbu “all,” adults and children in our
experiment may have accepted all>neg readings more easily when the accusative case marker
-o was attached to zenbu “all.” Terunuma (2003) tested children's interpretation of
zenbu-wa “all-TOP” with negation, and the children accepted neg>all readings more than 87%
of the time, suggesting that children are sensitive to the difference between zenbu-o
“all-ACC” and zenbu-wa “all-TOP.”


4 Children’s comprehension of focus particles in
Japanese clefts and Japanese right dislocations


This section addresses children’s incorrect associations of dake/sika “only” in JCs
and JRDs. In English, Crain, Ni, and Conway (1994) and Notley et al. (2009) report
on children’s incorrect association of only. When only modifies the subject NP as in
“Only the cat is holding a flag” in test sentences, children often incorrectly associate
the sentence-initial only as if it modifies the VP as in “The cat is only holding a flag.”
However, when only is placed before VP in test sentences, it seems that children do
not often incorrectly associate only with subject NPs. Hence, there is a subject-object
asymmetry in children’s wrong associations of only in English. Crain, Ni, and
Conway (1994) and Notley et al. (2009) suggest that children interpret only as a
sentential modifier as follows: [Only (the cat is holding a flag)]. Only c-commands
and modifies all phrases in the rest of the sentence, and, thus, children misinterpret
only attached to the subject as modifying the VP or the object. In Japanese, Endo
(2004) examined Japanese children’s interpretations of the focus particles sika/dake
“only” in canonical SOV sentences; she also observed a subject-object asymmetry in
Japanese.
(13) shows examples of dake and sika in canonical SOV sentences: 


(13) a. Inusan-dake ga densya o kat-ta yo. 
dog-FOC NOM train ACC buy-PAST SFP 
“Only the dog bought a train.” 
b. Inusan-sika densya o kawa-nakat-ta yo. 
dog-FOC train ACC buy-NEG-PAST SFP 
“Only the dog bought a train.” 


In (13a), the focus particle dake, corresponding to “only,” is attached to the subject
before the nominative case marker -ga. In (13b), sika, an NPI corresponding to
“nothing but,” is attached to the subject without a case marker. Unlike only in
English, because dake and sika do not appear in a sentence-initial position, one
might expect that Japanese children do not associate these focus particles
incorrectly. However, as noted, the behavior of children acquiring Japanese was quite
similar to that of children acquiring English (Endo 2004).¹⁵ Crain, Ni, and Conway
(1994) and Notley et al. (2009) suggest that children’s wrong associations stem from
the position of only in syntactic hierarchical structures (i.e., the c-command relation),
but it remains unclear whether the wrong associations are because of linear order
or hierarchical structures. In the children’s structure they suggested [Only (the cat
is holding a flag)], only can be associated with the first or second NP by focusing on
its linear order, or only can be associated with the subject or object NP based on the
hierarchical structure because only c-commands both.


15 Further, Sano (2012) tests whether Japanese children incorrectly associate dake attached
to subjects in scrambled sentences. He finds that the incorrect association is still
observed when objects are scrambled before subjects, which suggests that scrambled objects
are reconstructed to the canonical position (SOV).

The experiments aim to ascertain whether Japanese children wrongly asso- 
ciate dake/sika in non-canonical word order sentences, such as JCs and JRDs, and 
whether subject-object asymmetry is observed. Given that the focus particles in 
Japanese, such as -dake and -sika, are attached to XPs, and they cannot stand alone, 
what we mean by wrong association is that the focus particle attached to an NP is 
somehow associated with another NP in test sentences. 

We can make different predictions for children’s wrong associations based 
on the surface linear order or the c-command relation in JCs and JRDs. Children 
may wrongly associate the focus particle with another NP based on surface linear 
order: for example, the focus particle attached to objects may be wrongly associ- 
ated with subjects in OVS sentences based on linear order because objects appear 
before subjects linearly. However, the focus particle attached to objects may not be 
wrongly associated with subjects if children’s wrong associations are based on the 
c-command relation between subjects and objects of the canonical word order, SOV. 
Regarding the c-command relation between subjects and objects of the canonical 
word order, we assume focused items in JCs and right-dislocated items in JRDs are 
reconstructed to the original positions of the canonical word order hierarchically. 
The experimental results suggest that children incorrectly associate focus particles
based on the c-command relation of the canonical word order, SOV, and that children
reconstruct the focused items in JCs and the right-dislocated items in JRDs to
the original positions of the canonical word order.¹⁶ The next subsections will detail
the experiment and results.


16 From the results in Section 3, Japanese children know the anti-reconstruction property of JCs
with negation, the reconstruction property of JCs without negation (Shimada et al. 2019), and the
reconstruction property of JRDs regardless of the presence of negation. Test sentences in Section
4 do not include negation; thus, children can reconstruct dislocated items to original positions in
JCs and JRDs.


4.1 Experiment
The research question for this experiment is as follows.¹⁷


(14) Research question 
Do Japanese children incorrectly associate focus particles in non-canonical 
word order sentences, such as JCs and JRDs? If so, is it based on linear order 
or hierarchical structure (i.e., the c-command relation between subjects and 
objects in the canonical word order after reconstruction)? 


Table 6 shows the types of test sentences we used and the predictions based on 
surface linear order or hierarchical structure. 


Table 6: Predictions of the experiment.

Test sentence types                  Surface linear   Reconstructed         Hierarchical structure
                                     order            canonical structure   after reconstruction
(i)   canonical: [Focused S] O V     (A) Yes          =                     (B) Yes
(ii)  canonical: S [Focused O] V     (C) No           =                     (D) No
(iii) JC/JRD: [Focused S] V(,) O     (E) Yes          [Focused S] O V       (F) Yes
(iv)  JC/JRD: [Focused O] V(,) S     (G) Yes          S [Focused O] V       (H) No
(v)   JC/JRD: O V(,) [Focused S]     (I) No           [Focused S] O V       (J) Yes
(vi)  JC/JRD: S V(,) [Focused O]     (K) No           S [Focused O] V       (L) No


[Focused X] means that the focus particle dake or sika is attached to the subject
(S) or object (O). “Yes” (“No”) shows that children's wrong associations are (not)
predicted.

As mentioned in Section 3 (also in footnote 1), movements are involved in JCs 
and JRDs. If wrong associations occur based on surface linear order, they should 
occur based on the linear order after movements. However, if wrong associations 
occur based on reconstructed structures, wrong associations should occur based on 
the hierarchical structure of the canonical word order after reconstruction. 

Type (i) ([Focused S] OV) and Type (ii) (S [Focused O] V) are canonical word 
order sentences. As noted, prior studies, such as Endo (2004), have shown that 
wrong associations occur frequently in (i), not (ii), and we predicted that it would 
be the same in cells (A), (B), (C), and (D). 


17 Some parts of this experiment were originally reported in Mochizuki, Shimada, and Yamakoshi
(2021) and Shimada, Mochizuki, and Yamakoshi (2022).

In Type (iii) ([Focused S] V(,) O), the focus particle is attached to the subject.
The surface linear order is SVO, and wrong association with the object is predicted
in cell (E) because the subject appears before the object based on the surface linear
order. Next, the word order after reconstruction becomes SOV, and the subject
c-commands the object. Thus, the wrong association with the object is also
predicted in cell (F) since the subject c-commands the object in the reconstructed
structure.

In Type (iv) ([Focused O] V(,) S), the object with the focus particle appears in
the sentence-initial position. Thus, wrong association with the subject at the end
is predicted in cell (G) based on the surface linear order OVS. However, if children
reconstruct the sentence-initial object to its canonical position based on its
hierarchical structure, the word order becomes SOV, and wrong association with the
subject is not predicted in cell (H) since the object is c-commanded by the subject.

In Type (v) (O V(,) [Focused S]), on the surface linear order OVS, the subject
with the focus particle is at the end of the sentence; thus, wrong association with
the object is not predicted in cell (I). However, if children reconstruct the subject to
the canonical position, the word order becomes SOV, and the subject c-commands
the object. Hence wrong association is predicted in cell (J).

In Type (vi) (S V(,) [Focused O]), the focus particle is attached to the object at
the end of the sentence. On the surface linear order SVO, wrong association with
the subject is not predicted in cell (K) because the object appears after the subject
linearly. If wrong associations occur based on the reconstructed structure, the word
order after reconstruction becomes SOV. Wrong association with the subject is also
not predicted in cell (L), as the object does not c-command the subject in the
canonical word order.

In summary, as highlighted in gray in Table 6, predictions differ. If children's 
incorrect associations of the focus particles stem from surface linear order, wrong 
associations are predicted to occur in Types (iii) and (iv). However, if children's 
incorrect associations stem from hierarchical structures after reconstruction, then 
wrong associations are predicted to occur in Types (iii) and (v). 
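
The two sets of predictions in Table 6 can be restated as two simple rules. The following Python sketch is our own illustration and not part of the original study: the names SENTENCE_TYPES, predicts_linear, and predicts_hierarchical are hypothetical, and each sentence type is reduced to the surface order of subject and object and the NP bearing the focus particle. Under the linear-order rule, a wrong association is predicted when the focused NP precedes the other NP on the surface; under the hierarchical rule, it is predicted when the focused NP is the subject, which c-commands the object after reconstruction to SOV.

```python
# Each test sentence type from Table 6: surface order of subject (S) and object (O),
# and which of the two bears the focus particle. This encoding is an assumption
# made for illustration only.
SENTENCE_TYPES = {
    "i":   {"surface": ("S", "O"), "focused": "S"},  # canonical [Focused S] O V
    "ii":  {"surface": ("S", "O"), "focused": "O"},  # canonical S [Focused O] V
    "iii": {"surface": ("S", "O"), "focused": "S"},  # [Focused S] V(,) O
    "iv":  {"surface": ("O", "S"), "focused": "O"},  # [Focused O] V(,) S
    "v":   {"surface": ("O", "S"), "focused": "S"},  # O V(,) [Focused S]
    "vi":  {"surface": ("S", "O"), "focused": "O"},  # S V(,) [Focused O]
}


def predicts_linear(sentence):
    """Wrong association is predicted if the focused NP linearly precedes the other NP."""
    return sentence["surface"][0] == sentence["focused"]


def predicts_hierarchical(sentence):
    """After reconstruction to SOV, only a focused subject c-commands the other NP."""
    return sentence["focused"] == "S"


predictions = {
    name: (predicts_linear(s), predicts_hierarchical(s))
    for name, s in SENTENCE_TYPES.items()
}

# Types on which the two hypotheses make different predictions.
diverging = [name for name, (linear, hier) in predictions.items() if linear != hier]

for name, (linear, hier) in predictions.items():
    print(f"Type ({name}): linear={'Yes' if linear else 'No'}, "
          f"hierarchical={'Yes' if hier else 'No'}")
print("Diverging types:", diverging)
```

In this encoding, the two rules agree on Types (i), (ii), (iii), and (vi) and diverge only on Types (iv) and (v), which is why those two types are the critical test cases.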

We conducted the JC and JRD experiments at different times, but the same
stories and pictures were used. In the JC experiment, the subjects were 10 Japanese
monolingual children (5;7–6;5, mean: 6;0). We used the focus particle dake
exclusively because sika cannot appear in the focused position in JCs, as shown in
Section 3. In the JRD experiment, the subjects were 16 Japanese monolingual children
(5;2–6;10). We tested dake and sika for the JRD experiment and divided the children
into two groups: dake (5;7–6;7, mean: 6;3) and sika (5;2–6;10, mean: 6;2) groups.
The method was the truth value judgment task (Crain and Thornton 1998). Short
stories were presented by an experimenter with pictures on a computer screen.
After each story, an anime character, Anpanman, appeared beside the picture and a
recorded test sentence was given as Anpanman’s description of the story. Children
were asked to judge whether a test sentence by Anpanman was true or false. Following
Table 6, the experiment included two types of canonical word order sentences and
four types of target sentences. Each type comprised two trials, one with a matched and
one with a mismatched condition. We also included a practice session and four
filler items during the main session.

Consider examples of the test sentences for Type (iv) ([Focused O] V(,) S) and
the picture below:¹⁸


Figure 3: Picture presented with the test sentences in (15). 


(15) (Context)
     Experimenter: The squirrel took a carrot and a pepper. Now Anpanman will
     talk about the frog. (This was originally given in Japanese and translated into
     English.)
     (Test sentences)
     a. JC (with dake)
        Ninjin-dake o   tot-ta    no wa  kaerusan da  yo.
        carrot-FOC  ACC take-PAST C  TOP frog     COP SFP
        “It was a frog who took only the carrot.”

     b. JRD (with dake)
        Ninjin-dake o   tot-ta    yo,  kaerusan ga.
        carrot-FOC  ACC take-PAST SFP  frog     NOM
        “(She/he) took only the carrot, the frog.”

     c. JRD (with sika)
        Ninjin-sika tora-nakat-ta yo,  kaerusan ga.
        carrot-FOC  take-NEG-PAST SFP  frog     NOM
        “(She/he) took only the carrot, the frog.”


18 We conducted the JC experiment in Section 4 separately from the other experiments and
did not attach the nominative or accusative case marker to the focused NP. It would have
been desirable to attach the accusative marker to the object NPs in the focus position, as
in Sections 2 and 3. We suppose that the absence or presence of the accusative case marker
here did not change our results very much. Furthermore, Sugawara (2016) conducted
experiments to test children's comprehension of only in English using Question-Answer
Congruence (QAC) (Rooth 1985, von Stechow 1990). In this experiment, we could not include
discourse contexts based on QAC, but we aim to probe contexts considering QAC in future
research.


In JC (15a), dake is attached to the object at the beginning of the JC. If a child
comprehends it correctly, this test sentence is true because the frog only took a carrot
in the story. As dake is attached to the first NP ninjin “carrot,” children can wrongly
associate dake with the second NP kaerusan “frog” based on the surface linear
order. If children's wrong associations are based on the hierarchical structure
after reconstruction, the word order after reconstruction is S O-dake V, and children
should not associate dake with the subject NP kaerusan “frog” because the
subject is structurally higher than the object and the object-dake does not
c-command the subject. In JRDs (15b, c) with dake and sika, the focus particle is attached
to the object ninjin “carrot,” and the same prediction as in (15a) would apply.


4.2 Results and discussion 
The results of the experiment are shown in Table 7: 


Table 7: Children's correct response rates in JCs and JRDs.

                              JCs               JRDs
                              dake (N = 10)     dake (N = 8)      sika (N = 8)
(i)   [Focused S] O V         45.0% (9/20)      56.3% (9/16)      50.0% (8/16)
(ii)  S [Focused O] V         90.0% (18/20)     87.5% (14/16)     81.3% (13/16)

(iii) [Focused S] V(,) O      30.0% (6/20)      31.3% (5/16)      25.0% (4/16)
(iv)  [Focused O] V(,) S      95.0% (19/20)     81.3% (13/16)     93.8% (15/16)
(v)   O V(,) [Focused S]      55.0% (11/20)     25.0% (4/16)      31.3% (5/16)
(vi)  S V(,) [Focused O]      90.0% (18/20)     81.3% (13/16)     100% (16/16)


As for the canonical word order sentence types (i) and (ii), the results were quite
similar to those of Endo (2004). The correct response rates for Type (i) highlighted in
gray are not high because children wrongly associated the focus particle attached
to subjects with objects. However, the correct response rates for Type (ii) are high,
above 80%, because children did not wrongly associate the focus particle attached
to objects with subjects. Namely, a subject-object asymmetry is observed in (i) and
(ii), as reported at the beginning of this section.

In the JC and JRD experiments, the correct response rates for Types (iii) and (v) 
highlighted in gray were quite low, mostly around 30%, whereas those for Types 
(iv) and (vi) were quite high, above 80%. The children showed the same tendencies 
in JCs and JRDs, and there seemed not to be much difference between dake and 
sika. These results show that children make wrong associations frequently when 
the focus particle is attached to subjects but less often when attached to objects, 
regardless of the positions of subjects and objects on surface linear order. The 
results clearly show that there is subject-object asymmetry in children’s wrong 
association of focus particles in JCs and JRDs. 

As presented in Table 6, our prediction was as follows: if children’s wrong asso- 
ciations are based on surface linear order, the correct response rates for Types (iii) 
and (iv) were expected to be low, and those for Types (v) and (vi) were expected 
to be high in the JC and JRD experiments; however, that was not the case. The chil- 
dren responded to Types (iii) and (v) with low correct response rates but Types (iv) 
and (vi) with high rates. These results suggest that the children in the JC and JRD 
experiments incorrectly associated the focus particles based on the reconstructed 
hierarchical structures. 
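
The comparison between the predictions in Table 6 and the results in Table 7 can be made explicit with a small calculation. The following Python sketch is our own illustration under two assumptions that are not part of the original analysis: the three groups in Table 7 are pooled for each sentence type, and a pooled correct response rate below a hypothetical cut-off of 60% is treated as indicating wrong associations. The names CORRECT, PREDICTED_WRONG, and THRESHOLD are likewise hypothetical.

```python
# Correct responses / trials for each type in Table 7, in the order
# JC dake, JRD dake, JRD sika.
CORRECT = {
    "i":   [(9, 20), (9, 16), (8, 16)],
    "ii":  [(18, 20), (14, 16), (13, 16)],
    "iii": [(6, 20), (5, 16), (4, 16)],
    "iv":  [(19, 20), (13, 16), (15, 16)],
    "v":   [(11, 20), (4, 16), (5, 16)],
    "vi":  [(18, 20), (13, 16), (16, 16)],
}

# Predictions from Table 6: (surface linear order, hierarchical structure).
# True means that wrong associations are predicted.
PREDICTED_WRONG = {
    "i": (True, True),
    "ii": (False, False),
    "iii": (True, True),
    "iv": (True, False),
    "v": (False, True),
    "vi": (False, False),
}

# Hypothetical cut-off (an assumption, not from the study): pooled correct
# response rates below this value count as evidence of wrong associations.
THRESHOLD = 60.0


def pooled_rate(cells):
    """Pool the groups and return the correct response rate as a percentage."""
    correct = sum(c for c, _ in cells)
    total = sum(t for _, t in cells)
    return 100 * correct / total


observed_wrong = {t: pooled_rate(cells) < THRESHOLD for t, cells in CORRECT.items()}

# Number of sentence types on which each hypothesis matches the observed pattern.
linear_hits = sum(observed_wrong[t] == PREDICTED_WRONG[t][0] for t in CORRECT)
hierarchical_hits = sum(observed_wrong[t] == PREDICTED_WRONG[t][1] for t in CORRECT)

for t, cells in CORRECT.items():
    print(f"Type ({t}): pooled correct rate {pooled_rate(cells):.1f}%")
print(f"linear order matches {linear_hits}/6 types, "
      f"hierarchical structure matches {hierarchical_hits}/6 types")
```

Under these assumptions, the hierarchical predictions match the observed pattern for all six types, whereas the linear-order predictions fail for Types (iv) and (v). The sketch is purely descriptive; a different cut-off or an analysis by group could be substituted.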

Notably, these results do not contradict the results for scope interactions in
Section 3. Section 3 showed that children are sensitive to the reconstruction property
of JCs without negation and the anti-reconstruction property of JCs with negation.
Furthermore, children are sensitive to the reconstruction property of JRDs regardless
of the presence of negation. Given that the test sentences in this section are without
negation, the results suggest that the children reconstructed focused and
right-dislocated items to canonical positions in JCs and JRDs and that children’s incorrect
associations of the focus particles dake and sika stem from the hierarchical
structures after the reconstruction of subjects or objects in JCs and JRDs.


5 General discussion and conclusion 


This study focused on three aspects of the acquisition of JCs and JRDs: word order,
scope interactions, and association of the focus particles. From the results of the
experiments, children seem to be sensitive to differences between JCs and JRDs
concerning word order and scope interactions, while they show similar behaviors
for the wrong associations of focus particles. Section 2 examined whether children
use the Agent-first Strategy (Bever 1970, Hayashibe 1975) when they interpret JCs
and JRDs. The results show that children have difficulty with SCs, and we suggested
that children apply the Agent-first Strategy for SCs. However, children did not have
problems with SRDs. If children also used the Agent-first Strategy for SRDs, both SCs
and SRDs should be problematic for them, but the results revealed that the children
had trouble with SCs but not with OCs, SRDs, or ORDs.

Thus, to probe the reason for this difference, we suggested that the Discourse
Phrase Approach (Altinok 2020, Tomioka 2021) may provide insight: the
sentence-initial part in JRDs is an “essential part of the informational content of the
utterance” (Tomioka 2021). However, the sentence-initial part in JCs is a
presuppositional clause; thus, it may be easier for children to comprehend JRDs than JCs,
particularly SCs. As noted, the Agent-first Strategy may still be at work in JRDs, but
the importance of the sentence-initial part in the information structure of JRDs
suggested by Tomioka (2021) may override the effect of the Agent-first Strategy. Hence,
children's performance for JRDs is quite good relative to that for JCs. Another factor
in sentence processing may also be relevant: in JCs, the sentence does not end
after the presuppositional clause. It is followed by the complementizer no, the topic
marker wa, the focus element, and the copula da. In JRDs, however, the sentence
is suspended with the sentence-final particle yo and a pause shown by a comma
before the right-dislocated item. This difference between JCs and JRDs may explain
the difference in children's performance between JCs and JRDs, and children's
processing of JCs and JRDs requires further investigation. In summary, despite the
similarities between JCs and JRDs concerning their word order, this study has shown
that children clearly distinguish JCs and JRDs.

Section 3 provided another piece of evidence showing that children are aware
of the difference between JCs and JRDs: the anti-reconstruction property of JCs and
the reconstruction property of JRDs. We tested the scope interaction between
negation and an NP modified by the universal quantifier zenbu “all.” The results
suggest that children know the (anti-)reconstruction properties of JCs and JRDs and
thus the differences between JCs and JRDs.

In Section 4, we examined children’s incorrect associations of the focus particles
dake/sika “only/nothing but” in JCs and JRDs. In JCs and JRDs, children wrongly
associated dake and sika frequently when the particles were attached to subjects
but not objects, and a subject-object asymmetry was observed. Children are
probably aware of the difference in the syntactic structures between JCs and JRDs,
and they know how to reconstruct those into the structures with canonical word
order because their incorrect associations of the focus particles seem to occur after
reconstruction. We must further investigate how children capture the syntactic
structures of JCs and JRDs and why children's incorrect associations of focus
particles occur by examining various other aspects of the acquisition of JCs and JRDs.


Abbreviations

ACC   accusative
CL    classifier
COP   copula
FOC   focus
JC    Japanese cleft
JRD   Japanese right dislocation
LOC   locative
NEG   negation
NOM   nominative
OC    object cleft
ORD   object right dislocation
PROG  progressive
SFP   sentence-final particle
SC    subject cleft
SRD   subject right dislocation
TOP   topic


Yuki Hirose and Reiko Mazuka


Chapter 13

Developmental changes in the
interpretation of an ambiguous structure
and an ambiguous prosodic cue in Japanese


1 Introduction 


This chapter investigates whether adults and children abide by the same process- 
ing bias when encountering a global structural ambiguity and whether they share 
a common understanding of what certain prosodic phenomena signal in resolving 
the ambiguity. The first question is whether young children exhibit an adult-like 
processing bias (a local interpretation of a modifier), which supposedly results 
from an advantage in incremental processing. It is worth investigating because 
young children may not be as efficient as adults in processing continuous input as 
rapidly as it is received. The second question concerns how prosodic information is 
used by children, particularly in the case of a prosodic signal potentially associated 
with two different roles, namely, as a signal to syntactic structure and as a signal 
indicating contrastive status. We consider one instance of branching ambiguity in 
Japanese. 


1.1 Branching ambiguity in Japanese 


Studies on adults' online processing of three-part noun phrases with a branching
ambiguity in Japanese, such as (1), report an overall preference for the
interpretation with the left-branching (LB) structure over the interpretation with
the right-branching (RB) structure (Ito, Arai, and Hirose 2015; Hirose 2020).


(1) ao'i ne'ko-no ka'sa-wa doko
    blue cat-Gen umbrella-Top where

    a. Left-branching (LB)
       “Where's the umbrella with blue cats on it?”
       [[ao'i ne'ko]-no ka'sa]-wa doko?

    b. Right-branching (RB)
       “Where's the blue umbrella with cats on it?”
       [ao'i [ne'ko-no ka'sa]]-wa doko?


Open Access. © 2024 the author(s), published by De Gruyter. This work is licensed under the Creative
Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
https://doi.org/10.1515/9783110778939-013

This preference is expected from a processing perspective. That is, human sentence
processing operates incrementally, continuously assigning syntactic structure to
input that unfolds over time; head-final languages such as Japanese are no
exception (Inoue 1991; Inoue and Fodor 1995; among many others). In the example above
(modifier + N1-Gen + N2), the only available head for the modifier (ao'i “blue”) at the
point where N1 is processed is that very N1 (ne'ko “cat(s)”), thus determining the
node dominating ao'i + ne'ko, which will lead the entire NP to the LB configuration
(i.e., that blue-colored cats are on an umbrella). For the RB structure to be assigned
to the NP, the association between the modifier and its head would have to be
postponed until the final or third element (ka'sa “umbrella”) is processed, overriding
the pressure for an immediate association between the modifier and its modificand. The
RB structure is treated as a marked interpretation at the phonology-syntax
interface (Kubozono 1988) in that it requires distinct prosodic marking (such as metrical
boost), as will be discussed in depth in the following section. If the overall preference
for the LB over the RB structure mainly stems from the human parser's incremental
nature, the branching bias may be different in populations where comprehenders
may not comply with the pressure for immediate or incremental processing as
rapidly as adults (Snedeker and Yuan 2008; Ito et al. 2014; Hirose and Mazuka 2017).

One reason to assume RB structure can be more accessible for children is that 
children tend to resort to the intersective interpretation reported by Matthei (1982). 
In that study, children misinterpreted phrases such as the second green ball,
choosing a green ball in the second position in an array (where the first ball was in some
other color, thus making the green ball the first green ball). Linguists often expect 
that children's preferences in interpreting syntactically ambiguous word strings 
can reveal their grammatical knowledge. Matthei (1982) accounted for the so-called 
intersective reading as revealing the lack of an intermediate node dominating [green 
ball]. Instead of constructing such a hierarchical structure for the noun phrase the 
second green ball, the children adopted a non-hierarchical structure in which both 
second and green are assigned positions on the same level as that of the head noun 
ball. This explanation accords with Pérez-Leroux et al. (2018), where children (4-6 
years of age) had more difficulty producing NPs requiring recursive modification 
than NPs calling for sequential modification without recursive embedding. 

An alternative account of such phenomena, proposed by Hamburger and 
Crain (1984), posits that the apparent inability to compute a hierarchical structure 
reported in Matthei (1982) stems from children's performance errors. This account 
can accord with the idea that children's interpretive bias stems from the ease of 
computing the structure for the intersective reading relative to computing the 
structure for the correct reading. 

If we could set aside the advantage of LB structure in incremental processing, 
RB structure could be generated, and perhaps processed, with more ease. Fujita 
(2017) distinguished the syntactic operations by which (1a) and (1b) are generated
as Pot-Merge and Sub-Merge, respectively, as an analogy to concepts proposed in 
action grammar (Greenfield 1991), generalizing the strategies by which non-lin- 
guistic actions are controlled by children. In Pot- and Sub-Merge, there are two 
steps of Merge. In Pot-Merge, the target of both Merge operations is fixed; thus, both 
the intermediate node and the top node would have the same label. In Sub-Merge, 
the first Merge step targets the head of the intermediate node, which then merges 
with another head, the head of the entire phrase, resulting in a structure in which 
the intermediate node and the top node have different labels. Fujita argued that 
Sub-Merge is more taxing on human working memory, as it requires a sub-unit to 
be stored in the process. The difference between these Merge types is not automat- 
ically linked to the branching directions, but when applied to the two alternative 
structures in question [i.e., (1)], the RB structure would be generated through Pot- 
Merge while the LB structure would have to undergo Sub-Merge. The difference 
then would explain the higher cost of generating and perhaps processing the LB 
structure, independent of its incremental advantage, which children may not enjoy 
to the same extent as adults. This is the first question we will set out to answer: 
Do young children have a processing bias in branching ambiguities? Is it different 
from the LB bias of adults? 


1.2 The role of pitch prominence in resolving branching 
ambiguities 


Pitch prominence in Japanese has multiple functions, allowing certain prosodic 
signals to induce more than one way of interpreting an utterance at different lin- 
guistic levels. First, the Japanese pitch accent is directly linked to the lexical accent. 
Whether a Japanese lexical word is accented or unaccented is specified in the 
lexicon. Ota, Yamane, and Mazuka (2018) demonstrated that the ability to use pitch 
information to recognize words develops after 17 months. 

Pitch prominence also plays a role in projecting the syntactic structure. Studies 
show that pitch prominence contributes to distinguishing two noun phrases com- 
prising three elements configured differently (Kubozono 1988; Ito, Arai, and Hirose 
2015; Hirose 2020). The LB structure can provide a downstepping domain through- 
out the entire NP. Downstep is a phonological phenomenon where an accented word 
lowers the H peak of the following word within the downstep domain. In (1a), the 
downstep should occur over the three words ao'i, ne'ko-no, and ka’sa, which are all 
accented, producing a staircase-like f0 contour. Thus, the H peak of ne'ko in ao'i ne'ko 
(“blue cat”) in (1a) is realized lower than the same word in, for example, akai ne'ko
(“red cat”), which is not preceded by an accented word (akai is an unaccented word). 

The realization of downstep is affected by the syntactic structure behind the
three words (Kubozono 1988; Selkirk and Tateishi 1991). The downstep that would 
occur on the second word (ne’ko [“cat”], as in [1b]) is interfered with at the left edge 
of an RB structure. The second word, ne’ko, is realized with an elevated pitch, which 
appears to cancel out the declination of the f0 peak because of a downstep. The 
nature of such interference is under debate. One interpretation is that the left edge 
of the RB structure is associated with a prosodic boundary, thereby resetting the 
domain of downstep (Pierrehumbert and Beckman 1988; Selkirk and Tateishi 1991). 
An alternative interpretation maintains that the downstep continues over the three 
words but the f0 of the H peak is realized higher at the left edge of an RB structure, 
still within the same prosodic phrase; it is called metrical boost (Kubozono 1988). 
Both accounts presuppose that the prosodic difference associated with the two 
branching structures is derived from the grammatical knowledge controlling the 
correspondence between syntax and prosody. 

There is yet another level of information that pitch accent is responsible for 
encoding in speech. In Japanese, like many other languages, pitch prominence can 
express discourse and information status conveyed by focus (Ito et al. 2012; Jincho, 
Oishi, and Mazuka 2019). Focus prosody can emphasize new information relative 
to old or given information, the target of a question expressed with a wh-phrase, 
and contrastive information. Focus prosody comprises a notable pitch elevation 
on the focused element, followed by compression of the pitch range continuing up 
to the end of the focus domain (i.e., post-focal reduction; see Ishihara 2011). 

In perception, to understand whether the input being processed is in linguistic 
focus, listeners must recognize the pitch elevation on the focused element and the 
post-focal reduction that follows to decide on the domain of the focus. However, evi- 
dence from sentence processing shows that listeners rely mainly on pitch elevation 
on the focused item alone (Kitagawa and Hirose 2012). It accords with the human 
parser’s real-time decision-making about the linguistic status of incoming input 
being incremental in nature; the parser makes interpretations without waiting for 
the confirming information that may be provided by later input, such as, in this 
case, post-focal reduction. 


1.3 Resolving the ambiguity in pitch prominence to resolve 
branching ambiguities 


The discussion thus far establishes the possibility that the same acoustic event, 
namely an increase in the peak f0 height of an accented word, could stem from 
the prosodic phenomena driven by the syntactic structure (whether metrical boost 
or reset, associated with the RB syntax) or a focus event (the word is pronounced 
with focus prominence). The subsequent input would provide additional
information to decide which factor is responsible for the rise. For example, if ao'i ne'ko-no
ka'sa (“blue cat-Gen umbrella”) is realized with a notable pitch rise on ne'ko (“cat”),
instead of downstep continuing on all three items, it could be perceived as evidence 
for the RB syntax. The later input would have to accord with this interpretation; 
that is, the rest of the input would have to accord with an RB structure comprising 
three elements, not four (because a sequence of four elements would be subject to 
rhythmic boost (Kubozono 1989) and an f0 raising effect that can result in reorgan- 
izing a four-word syntactic structure into two prosodic minor phrases, overriding 
the f0 contour expected by the application of metrical boost). If, however, a listener 
is to guess which interpretation the incoming input ao'i ne'ko-no would be associ- 
ated with, the pitch rise on ne'ko would be a useful clue to expect (1b), with the RB 
structure. 

The situation becomes more complicated if one is choosing from the two pos- 
sible alternative structures, LB and RB, in a situation where context provides room 
for contrastive interpretations (Ito, Arai, and Hirose 2015). For example, the pitch 
prominence on ne’ko could stem from contrastive focus to emphasize the infor- 
mation ne'ko as opposed to possible alternatives (e.g., some other animal that is 
also blue). Again, the elevated pitch alone would not be sufficient to determine the 
word's focus status without evidence for the occurrence of post-focus pitch
compression on the subsequent input. If the background information (i.e., preceding
context or the situation in which the expression is heard) does not provide any 
context in which ne'ko could stand in a contrastive relationship with some other 
entity, there may be no motivation to assign the focus interpretation to the pro- 
sodic event. However, if the contrastive context is properly induced (e.g., by the 
presence of another entity that is also blue), listeners may be anticipating that the 
appropriate part of the input will receive focus prosody. 

Hirose (2020) conducted a series of visual world paradigm eye-tracking
experiments using Japanese sentences such as (1a, b), one with downstep throughout the
NP, the other with a pitch rise on the second word (e.g., ne'ko), to investigate how
such ambiguous prosodic information (namely, a pitch rise on the second item of an 
NP consisting of three accented items) is interpreted in real-time processing. In one 
of the experiments using pictures, such as the one presented in Figure 1, where the 
contrastive interpretation of the pitch rise is felicitous in the visual context (given 
the presence of blue squirrels on an umbrella), the LB target attracted more looks 
relative to the RB target as soon as the second word in question became available. 

This outcome was considered evidence that the pitch rise information induced 
a contrastive interpretation (e.g., it is the cat that is blue, not the squirrel) based on 
the input available at that point, which accords more with the LB interpretation of 
the entire set of experimental materials (because there was only one visual object 
with a blue cat on it). Backing up this explanation, the study also reported that
the same pattern was not observed when the picture did not have possible
competitors appropriate for the contrastive interpretation.

Figure 1: Example visual scene for the ambiguous item exemplified in (1). The color used for the
four umbrellas (whether applied to the background of the umbrellas or the animal illustrations
decorating the umbrellas) was the same (blue in this example).

Interestingly, toward the end of such a sentence, when the listeners had to 
finalize their interpretation to perform the task (clicking on the most appropriate 
picture), the sentences that had the pitch rise on the second word (W2) induced 
more looks to the RB target relative to the LB target. By the time the entire input was 
received, and it was evident that the input involved a three-part NP, which means 
the condition for the metrical boost was provided, the parser had re-interpreted 
the pitch phenomenon as a signal to the RB syntax. This finding shows that adult 
Japanese listeners can efficiently and flexibly use the same prosodic cue differently 
at different times during the processing of a sentence, making the prosodic signal 
useful regarding whatever partial information is available at each point during the 
process. In this study, however, the identical acoustic signals resulted in different 
interpretations per the visual context alone. Importantly, the results also accorded 
with the view that syntax-marking prosodic cues are understood independently 
from context or situation (Speer, Warren, and Schafer 2011), as the W2 rise even- 
tually induced an increase in the RB interpretation regardless of the contextual 
manipulation. 

1.4 How do children cope with prosodic ambiguity
and syntactic ambiguity? 


It is worth further investigating whether young children's interpretations follow 
the same patterns as adults' interpretations. Children's behavior in such contexts 
is relevant to multiple questions. First, when do children learn to process complex 
phrases comprising three elements with a branching ambiguity? 

Several studies in different languages report varying degrees of sensitivity of 
children to prosodic information that marks phrase and clause boundaries (Choi 
and Mazuka 2003; Carvalho, Dautriche, and Christophe 2016; Carvalho et al. 2016; 
Snedeker and Yuan 2008). Regarding pitch accents in Japanese, Ito et al. (2012) show 
that six-year-olds are sensitive to pitch accents in real-time contrast resolution in 
color Adjective + Noun (where the color information is contrastive) in Japanese.
Jincho, Oishi, and Mazuka (2019) further tested six- and five-year-olds using similar
visual materials to Ito et al. (2012), except that the manipulations in contrastive
status were realized in the visual context rather than prosody. They observed a
facilitation effect of the contrastive information in six-year-olds but not in five-year-olds.

Young children's sensitivity to pitch information regarding the resolution of syn- 
tactic branching ambiguities has yet to be documented. If the Japanese children have 
acquired the association between a certain pattern of the F0 contour and the RB 
syntax and that with contrastive focus, there remains the question of whether adults 
and children always share a common understanding of what certain prosodic phe- 
nomena signal in a given context. Children should have learned that the RB structure 
is somewhat marked and, therefore, must be prosodically signaled by a rising pitch 
on the edge of the right-branching node (whether that pitch change is placing a pro- 
sodic boundary or applying a metrical boost). It further presupposes that children 
can construct and process LB and RB syntactic structures to recognize the difference, 
which requires the ability to handle the hierarchical configurations of the phrase 
structures in question. Moreover, children would have to be sensitive to how con- 
trastive status is reflected by linguistic and prosodic features. They would need to 
know how contrastive focus is encoded in the prosodic representation, which corre- 
sponds to the syntactic representation. If either knowledge type (how RB structures 
and contrastive focus are encoded in prosody) remains unavailable in real-time com- 
prehension, we will not see the adult-like patterns reported above in children's data. 

Whether the speed and efficiency with which children apply the relevant 
knowledge in real-time processing differ from that of adults is another issue. In 
general, visual world paradigm research shows that children's eye responses to lin- 
guistic information are subject to delay by several hundred milliseconds relative 
to adults' (Trueswell et al. 1999; Arnold 2008; Snedeker and Yuan 2008). Regarding 
the real-time use of pitch accent information, studies report that elementary school 
children (six to 11 years of age) are subject to a 400 ms delay (Ito et al. 2014; Hirose
and Mazuka 2017). 

Beyond timing differences, children at age six demonstrate the ability to utilize 
pitch accent information when processing contrastive status in an online task, but 
they need more time than adults to redirect their attention between tasks to cor- 
rectly interpret the contrastive prosody (Ito et al. 2012). If children’s online response 
to contrastive pitch accent is subject to a sizable delay in the same task as that used 
in Hirose (2020), the immediate effect of W2 pitch rise (facilitating the LB inter- 
pretation via the contrastive interpretation of the pitch rise) may not be observed 
early enough, even if the effect is present. Alternatively, by the time the W2 pitch 
rise is recognized and processed for its function, subsequent information induc- 
ing the alternative interpretation of the pitch rise information perceived earlier 
(e.g., as a signal to the RB syntax) may already have become available, canceling 
out the contrastive interpretation of the pitch rise. If the contrastive interpretation 
is not computed quickly, the bias toward the LB target may not occur, as the asso- 
ciation of the contrastive interpretation of the pitch rise on the first N (e.g., ne’ko) 
and the LB interpretation presupposes the immediate and very local computation 
of the Adjective + N, without considering the alternative branching structure that 
becomes available when the subsequent N (e.g., ka’sa) is processed. This outcome is
most likely for young children. 

Our main goal is to investigate the period within which the branching structure 
is determined and whether and how the prosodic information plays a role for young 
children. We compared two groups of three- and four-year-old children. 


2 Experiment 


A visual world paradigm experiment, using nearly identical stimuli to Hirose (2020), 
was conducted with native Japanese-speaking children. 


2.1 Method 
2.1.1 Participants 


The child participants comprised 28 three-year-olds and 25 four-year-olds, recruited 
at Riken Brain Science Institute; their parents received a small payment. All the 
children resided in the vicinity of Wako-city, Saitama, and their parents were all 
speakers of Tokyo Japanese. 

2.1.2 Materials


2.1.2.1 Linguistic stimuli 

The experimental sentences were the same as in Hirose (2020). The critical mate- 
rials comprised 12 experimental audio items of the form: [color word] + N1-Gen + 
N2-Top, followed by wa dore (“which one”), as exemplified in (1), repeated here.


(2) ao'i ne'ko-no ka'sa-wa do're?
    blue cat-Gen umbrella-Top which
    ‘Which one is the umbrella with blue cats / the blue umbrella with cats?’
    (Note: Japanese does not have a number distinction; ne'ko is compatible
    with either a single cat or multiple cats.)


All three constituents of the NP were lexically accented words (hence subject to 
downstep if they form an LB structure). There were also 12 filler audio items with 
no branching ambiguity or particular pitch emphasis, which were also included in 
the experiment. Some of the fillers mentioned the color or pattern of an entire
object or of part of an object, in both cases using various syntactic forms.

The audio materials were recorded by a female speaker of Tokyo Japanese 
who was familiar with Japanese phonology. Each item came in two audio ver- 
sions. In one, the speaker had been asked to read the item to express the intended 
meaning of the LB interpretation. This version was the downstep (default prosody) 
condition. In the other version, the speaker had been asked to have the RB interpre- 
tation in mind. The second words (e.g., ne'ko) were pronounced with a raised pitch 
(relative to the first version) to counteract downstep. This version was the W2 pitch 
rise condition. Table 1 summarizes the relevant measurements of the experimental 
audio stimuli. 


Table 1: Acoustic profiles of the relevant parts (i.e., the syntactically ambiguous noun phrases) of 
the audio stimuli (mean duration and mean peak f0 values over 12 items. Figures in parentheses are 
standard deviations). 


            color word            N1                    Gen                   N2
condition   Dur (ms)   f0 (Hz)    Dur (ms)   f0 (Hz)    Dur (ms)   f0 (Hz)    Dur (ms)   f0 (Hz)
downstep    875 (166)  401 (18)   697 (142)  346 (16)   250 (25)   270 (23)   869 (158)  281 (18)
W2 rise     853 (151)  387 (19)   693 (143)  421 (9)    223 (22)   338 (26)   854 (150)  275 (7)


In all items, the second constituent and its genitive-marking particle in the
downstep condition had f0 peaks at least 50 Hz lower than their counterparts in the W2
pitch rise condition, yielding a significant difference between the conditions. The
f0 peaks of the initial color words were significantly higher on average in the
downstep than in the W2 rise condition. This can be considered anticipatory raising
preceding downstepping elements (Rialland 2001) or the lowering of the word in the
W2 rise condition before a planned rise. The genitive marker had a longer duration
on average in the downstep than in the W2 rise condition. No durational difference
between the two conditions was found in any other word.

Figure 2: A sample pitch track of the experimental item (1) in the two prosodic conditions.
(Adapted from Hirose 2020 with permission.)
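
The direction of the condition differences described here can be checked against the means in Table 1. The following is only an illustrative sketch: the values are copied from Table 1, while the dictionary layout and the helper name `condition_difference` are our own assumptions. It works on the condition means alone, so it cannot reproduce the per-item comparisons or the significance tests.

```python
# Mean peak f0 (Hz) and mean duration (ms) per constituent, copied from Table 1.
F0_HZ = {
    "downstep": {"color": 401, "N1": 346, "Gen": 270, "N2": 281},
    "W2 rise": {"color": 387, "N1": 421, "Gen": 338, "N2": 275},
}
DUR_MS = {
    "downstep": {"color": 875, "N1": 697, "Gen": 250, "N2": 869},
    "W2 rise": {"color": 853, "N1": 693, "Gen": 223, "N2": 854},
}


def condition_difference(table):
    """Return W2 rise minus downstep for each constituent (hypothetical helper)."""
    return {
        part: table["W2 rise"][part] - table["downstep"][part]
        for part in table["downstep"]
    }


f0_diff = condition_difference(F0_HZ)    # positive = higher in the W2 rise condition
dur_diff = condition_difference(DUR_MS)  # negative = shorter in the W2 rise condition

for part in ("color", "N1", "Gen", "N2"):
    print(f"{part:>5}: f0 {f0_diff[part]:+d} Hz, duration {dur_diff[part]:+d} ms")
```

On these means, N1 and the genitive particle are higher in the W2 rise condition, whereas the color word is higher in the downstep condition.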


2.1.2.2 Visual stimuli 

All visual stimuli were identical to those used in Hirose (2020) (Experiment 2). 
Each visual display was divided into eight areas, each containing an object. Figure
1 presents an example visual scene. The eight objects included the LB target and
the RB target, each corresponding to the LB or the RB interpretation of the audio 
stimuli. There were also LB and RB distractors mimicking the design of LB and RB 
targets but featuring different animals with the same colors as the targets. The 
presence of such objects with the same colors should establish a contrastive rela- 
tionship with each of the LB and RB targets. Other objects were unrelated fillers, 
one of which had the same color mentioned by the audio sentence to serve as an 
additional color distractor. The positions of the different object types were bal- 
anced across items. 


2.1.3 Procedure

Participants sat in front of a Tobii 1750 eye-tracker. After the participant fixated on
the fixation cross, the visual display appeared on the screen. There was a 2500 ms
delay between the presentation of the visual stimuli and the onset of the spoken
stimuli to allow participants to scan all eight objects. To respond to the sentence
they heard, children were instructed to select an object by pointing with a stick, and
the experimenter made the click input for them (the accuracy of response times
was, thus, compromised). Participants' eye movements were recorded from the
onset of the audio stimuli until the click at a sampling rate of 50 Hz.

Each participant listened to all 12 experimental sentences, six in the downstep 
condition and six in the W2 pitch rise condition, arranged into two counterbal- 
anced lists. The presentation order was varied in each list and arranged such that 
no two experimental items were presented in a row. 


2.2 Data analysis and results 
2.2.1 Final target selection responses 


There were two alternative visual objects in this experiment, either of which could 
be considered the correct target (LB target or RB target), among the eight objects in 
the scene. The percentage of erroneous choices (fillers and LB and RB competitors, 
i.e., any choice other than the LB or RB targets) by four-year-olds and
three-year-olds was 1.3% and 5.25%, respectively. None of the participants made more than
two erroneous choices in the 12 experimental trials. This shows that the partici- 
pants in all groups were largely capable of handling the task of choosing one of 
the legitimate targets while eliminating six illegitimate candidates in response to 
the audio-linguistic stimuli. The erroneous cases were removed from the data. The
children chose RB and LB targets at least once each, rather than sticking to the same
interpretation throughout the experiment.

Participants' final clicking responses for each group were then analyzed in a 
generalized linear mixed model (GLM). The dependent variable was the binary 
target choice, where LB and RB target choices are coded as 0 and 1, respectively. 
The fixed factor was the prosodic manipulation (prosody), where W2 pitch rise 
and downstep conditions are effects coded before being centered. Participants 
and items were considered random factors. The final model was selected through 
backward selection to achieve the simplest possible model while ensuring com- 
parable explanatory power with the maximal random effects structure. Figure 3 
shows the percentages of the LB and RB binary target choices on each condition 
in each group. 
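
The coding of the prosody factor described above can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the ±0.5 coding values, the choice of W2 rise as the positive level, and the function names are all assumptions, since the text states only that the conditions were effect-coded and then centered. Centering matters once erroneous trials have been removed, because the two conditions are then no longer perfectly balanced.

```python
def effect_code(conditions, positive_level="W2 rise"):
    """Effect-code a two-level factor as +0.5 / -0.5 (the magnitude is an assumption)."""
    return [0.5 if c == positive_level else -0.5 for c in conditions]


def center(values):
    """Subtract the mean so the predictor sums to zero even in unbalanced data."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# Hypothetical trial list for one participant after one erroneous
# downstep trial has been removed (6 W2 rise trials, 5 downstep trials).
trials = ["W2 rise"] * 6 + ["downstep"] * 5
prosody = center(effect_code(trials))

# The binary response would be coded separately: 0 = LB target, 1 = RB target.
print(prosody[0], prosody[-1], sum(prosody))
```

The distance between the two coded levels stays at 1 after centering, so the prosody coefficient can still be read as the difference between conditions on the logit scale.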

GLM analyses were conducted separately for the three- and four-year-olds and 
for both age groups combined, with the LB or RB binary choice as DV, where both 
the prosody and age in months (scaled) were considered as fixed factors. Notably, 
the four-year-olds exhibited a response bias toward RB (as shown by the intercept 
significantly greater than zero, z = 3.913, p < .001). Across-age-group comparisons,
with age group (age in years) as a fixed factor, revealed that the four-year-olds had
a reliably stronger RB bias than the three-year-olds (the effect of age group: z =
-2.168, p < .05). Three-year-olds showed an overall lack of the LB/RB response bias
(intercept for three-year-old group: z = 1.272, p > 0.1). As far as the final object
selection was concerned, the prosodic manipulation had no reliable impact for either
age group of children (effect of prosody for the three-year-old group: z = 1.226, p >
0.1; four-year-old group: z = -0.953, p > 0.1), and there was no interaction between
the prosodic manipulation and age in months (z = 1.362, p > 0.1). The lack of a
prosodic effect in four-year-olds is not consistent with what is suggested by the
eye-movement data reported below.

Figure 3: Proportion of RB and LB binary choices based on the subject means for each age group in
the two prosodic conditions with 95% CI bars (shaded areas are for RB and unshaded areas for LB).


2.2.2 Eye-tracking data 


The eye-movement data analysis mainly examined participants' gaze bias between 
the LB and RB targets over time. First, the gaze data were broken down into
discrete 100 ms time windows (1-100 ms, 101-200 ms, etc.), each of which contained up to
five sampling points. This way, the impact of tracking loss is minimized while allowing
sufficiently detailed detection of the time course of the eye-gaze pattern. The first
time window started from the offset of the second word (e.g., ne'ko), where the
prosodic manipulation (downstep vs. W2 pitch rise) becomes distinct but before
further information (as to whether the pitch rise is followed by post-focal
reduction) was available. For each of the visual targets (LB and RB targets), the number of
gazes was summed for each window to calculate the log of the ratio of the sum
of gazes on the left-branching target to the sum of gazes on the right-branching
target. This is referred to as the log-ratio, representing the gaze bias score (Jincho, Oishi,
and Mazuka 2019). In this experiment, a positive log-ratio shows a
bias toward the LB target, while a negative log-ratio shows a bias toward
the RB target. Zero indicates that participants considered the alternative targets to the
same extent in the given time window.
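
The windowing and the gaze bias score can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the `(time, target)` sample format are hypothetical, and the smoothing constant of 0.5 added to both counts is our own choice for avoiding division by zero and log(0), since the text does not say how empty cells were handled.

```python
import math

WINDOW_MS = 100  # window size stated in the text; samples arrive every 20 ms at 50 Hz


def window_index(t_ms):
    """Map a time in ms from W2 offset to a 0-based window (1-100 ms -> 0, 101-200 ms -> 1)."""
    return (t_ms - 1) // WINDOW_MS


def log_ratio_by_window(samples, smoothing=0.5):
    """Compute log(LB gazes / RB gazes) per window.

    samples: iterable of (t_ms, target) pairs, where target is "LB", "RB",
    or any other label (ignored). The smoothing constant is an assumption.
    """
    counts = {}
    for t_ms, target in samples:
        lb_rb = counts.setdefault(window_index(t_ms), [0, 0])
        if target == "LB":
            lb_rb[0] += 1
        elif target == "RB":
            lb_rb[1] += 1
    return {
        w: math.log((lb + smoothing) / (rb + smoothing))
        for w, (lb, rb) in sorted(counts.items())
    }


# Hypothetical gaze samples: mostly LB in window 0, only RB in window 1, balanced in window 2.
demo = [(20, "LB"), (40, "LB"), (60, "LB"), (80, "LB"), (100, "RB"),
        (120, "RB"), (140, "RB"), (160, "RB"), (180, "RB"), (200, "RB"),
        (220, "LB"), (240, "RB"), (260, "filler")]
scores = log_ratio_by_window(demo)
```

With this sign convention, positive scores indicate an LB bias and negative scores an RB bias, matching the description in the text.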

The final goal of the eye-movement analysis was to see whether the gaze bias 
between the two alternative targets was affected by the prosodic manipulation 
(W2 pitch rise), while the participant was listening to a sentence, where an effect 
type may also be subject to change at different stages of development. The exact 
time course of these possible effects in different age groups, therefore, cannot be 
predicted. For this reason, our eye-movement analyses took two steps. The first 
analysis aimed to identify reliable time intervals informative of the prosodic effect 
in an objective way, as opposed to the researchers' subjective choice of the time 
windows. The second set of analyses used linear mixed-effect models to probe the 
prosodic effect on the log-ratio (gaze bias between the two branching targets), con- 
sidering participant and item random factors. 

For the first stage (identifying the time intervals for analyses), we used the 
non-parametric permutation-based test (Maris and Oostenveld 2007), which is a bot- 
tom-up approach adopted in visual world paradigm studies with Japanese-speak- 
ing child populations (Hirose and Mazuka 2017; Jincho et al. 2019). We followed the 
same procedure used by Hirose and Mazuka (2017), which is also similar to that 
employed by Jincho et al. (2019), to identify the time interval in which the effect of
the condition manipulation was reliable on objective grounds.
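
The logic of a cluster-based permutation test over time windows can be sketched as follows. This is a simplified illustration, not the procedure of Hirose and Mazuka (2017): it assumes a within-participant comparison of per-window log-ratios, a cluster-forming threshold of |t| > 2, cluster mass defined as the sum of t values, and random sign-flipping of each participant's difference scores. All function names and parameter values are hypothetical.

```python
import math
import random
import statistics


def paired_t(diffs):
    """One-sample t statistic for per-participant difference scores."""
    mean = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    if sd == 0:
        return 0.0 if mean == 0 else math.copysign(math.inf, mean)
    return mean / (sd / math.sqrt(len(diffs)))


def find_clusters(t_values, threshold):
    """Return (start, end, mass) for runs of adjacent supra-threshold windows
    whose t values share the same sign; mass is the sum of t in the run."""
    clusters = []
    start, mass, sign = None, 0.0, 0
    for i, t in enumerate(list(t_values) + [0.0]):  # sentinel closes last run
        s = (t > threshold) - (t < -threshold)
        if s != 0 and s == sign:
            mass += t
        else:
            if sign != 0:
                clusters.append((start, i - 1, mass))
            start, mass, sign = (i, t, s) if s != 0 else (None, 0.0, 0)
    return clusters


def cluster_permutation_test(cond_a, cond_b, threshold=2.0, n_perm=1000, seed=0):
    """cond_a, cond_b: one list of per-window scores per participant.
    Returns (start, end, mass, p) for each observed cluster."""
    rng = random.Random(seed)
    diffs = [[a - b for a, b in zip(pa, pb)] for pa, pb in zip(cond_a, cond_b)]
    n_windows = len(diffs[0])

    def t_series(rows):
        return [paired_t([row[w] for row in rows]) for w in range(n_windows)]

    observed = find_clusters(t_series(diffs), threshold)
    null_max = []
    for _ in range(n_perm):
        # Flipping the sign of a participant's differences swaps the
        # condition labels for that participant.
        flipped = []
        for row in diffs:
            flip = rng.choice((1, -1))
            flipped.append([flip * d for d in row])
        masses = [abs(m) for _, _, m in find_clusters(t_series(flipped), threshold)]
        null_max.append(max(masses, default=0.0))
    return [
        (s, e, m, sum(nm >= abs(m) for nm in null_max) / n_perm)
        for s, e, m in observed
    ]


# Hypothetical data: 8 participants, 6 windows, an effect in windows 2-3 only.
noise = [0.1, -0.1, 0.2, -0.2, 0.1, -0.1, 0.2, -0.2]
cond_a = [[noise[i], noise[i], 1.0 + 0.05 * i, 1.0 + 0.05 * i, noise[i], noise[i]]
          for i in range(8)]
cond_b = [[0.0] * 6 for _ in range(8)]
result = cluster_permutation_test(cond_a, cond_b)
```

The design choice here is that each cluster is evaluated against the distribution of the largest cluster mass obtained under permutation, which is what lets the test select time intervals without a prior choice of windows by the researcher.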

Once the relevant time intervals to inspect were decided, linear mixed-effect 
models (LME, using the lme4 package in R) were utilized to analyze the log-ratio
(expressing the fixation bias between LB and RB targets), with the prosodic manip- 
ulation as a fixed factor. The random factors included participants and items. The 
model with the maximal random structure was first considered, and the final 
model was selected via a backward selection procedure among converging models. 


2.2.2.1 Three-year-olds 

Figure 4 shows the proportion of fixation on each visual object. No reliable difference
between the two prosodic conditions can be inferred from the outcome: the
permutation-based analysis identified no cluster of time windows suggesting a
prosodic effect.

Figure 4: Plotted log-ratio in downstep and W2 rise conditions for 0-4000 ms for three-year-olds.

Figure 5: Plotted log-ratio in downstep and W2 rise conditions for 0-4000 ms for four-year-olds. The
shaded area (2500-2900 ms) indicates the time interval identified by the permutation-based analysis.


2.2.2.2 Four-year-olds

The log-ratio graph in Figure 5 suggests an overall bias for the RB target relative to
the LB target after the entire NP has been heard. This bias was enhanced with the
W2 rise: the permutation-based analysis detected a 2500-2900 ms time interval
(the shaded graph area) where the W2 rise condition led to a reliably larger RB bias
than the downstep condition. The timing roughly coincided with when the participants
made their clicking responses as they decided between the LB and RB targets.
The LME analysis conducted for this time interval confirmed that the bias for the
RB target increased with the W2 rise (B = -0.182, SE = 0.091, t = -2.00, p < .05), where
the selected model included a by-participant random slope along with by-participant
and by-item random intercepts.

We ran two further LME analyses separately per the final target choice, where
the selected models had the same random effect structure as the analysis above on all
participants. The effect of prosody remained reliable (B = -0.232, SE = 0.102, t = -2.28,
p < .05) for the subset of the data where the RB target was selected. The prosodic
effect was not observed, and there was no reversed trend, when the final choice was
the LB target (B = -0.1124, SE = 0.108, t = -1.04, p > .05). Thus, the observed prosodic
effect on the gaze bias toward the RB target led to more RB choices in the final 
interpretation. Meanwhile, the transient bias toward the opposite direction (the 
LB) among adult participants in the prior study did not show up in the eye-move- 
ment data in this study. 


2.3 General discussion 
2.3.1 Processing bias in the branching ambiguity across groups 


The four-year-old group showed an overall processing bias toward RB in processing
the Adjective + NP-GEN + NP structure, both in the final target choice and in the gaze
bias between the LB and RB targets. These findings run counter to most processing
models' presumption of a locality advantage. Moreover, the RB bias is not in line 
with the standard assumption of a phonology-syntax interface in Japanese, where 
the RB structure is considered a marked construction requiring some phonological 
demarcation, whereas the LB structure is accompanied by default prosody (Kubo- 
zono 1988). 

For adults, the LB structure would still have an advantage in online sentence 
comprehension as it can be achieved by assigning most local structural positions to 
the incoming input as a sentence unfolds. However, young children, with a limited 
capacity for rapid incremental processing, may not enjoy the locality advantage as
much as adults. Their processing choices may be more influenced by the ease with
which certain structures are constructed or understood, such as the operational 
ease associated with Pot-Merge (Greenfield 1991; Fujita 2017), which corresponds 
to the RB structure here, relative to Sub-Merge, which in this case matches the LB 
structure. We have less reason to believe the observed RB bias comes from four- 
year-olds’ inability to construct any hierarchical structure across the board, con- 
sidering the children’s sensitivity to the prosodic manipulation when they were 
selecting the target, which was revealed by the eye-tracking data. That is, the four- 
year-olds interpreted the f0 rise on the second word as a realization of a syntax-sen- 
sitive prosodic cue (e.g., metrical boost), indicating that the children had already 
constructed an RB structure. 

Such a processing bias was not observed in the three-year-olds. All partici- 
pants chose each of the different structure types at least once, instead of persisting 
with one interpretation and applying it to all trials. However, we found no consistent
pattern for the group as a whole. The young participants could perform the
complex task, choosing one target among eight objects, with few erroneous
responses. Even so, we have no solid evidence that a successful choice meant
success in constructing the relevant syntactic structure for real-time comprehension.
This issue warrants further investigation.


2.3.2 Interpretation of the pitch prominence 


The main finding of the prosodic manipulation is again from the four-year-old chil- 
dren. In this group, the W2 pitch rise increased looks to the RB targets relative 
to looks to the LB targets not in the middle of processing the complex NP but at 
the stage when the participants are about to make the clicking choice. This makes
sense if the W2 rise is interpreted as a metrical boost because it would only be 
relevant once RB syntax becomes a possibility (i.e., not until the second noun, e.g., 
ka’sa “umbrella”). The finding suggests that children at age four can construct both 
branching structures and recognize the appropriate prosodic pattern to demarcate 
the RB syntax during processing. 

The lack of a prosodic effect in the three-year-old group makes it difficult to 
reach a definitive conclusion as to whether they did not recognize the prosodic 
difference or had yet to learn how the distinct prosodic patterns correspond to the 
two branching structures. Alternatively, they could have been making
random choices between two sufficiently plausible options (where a cat and an
umbrella both have at least some blue color) without really building the appropriate
hierarchical structure, in which case the prosodic manipulation would not be
relevant. This age group would need to be tested with a less demanding task for
further investigation. It may also be worth checking the possibility that the younger
population is more sensitive to the effect of relative plausibility between the two
alternatives (e.g., a blue umbrella vs. a blue cat), although they understood the
objects mentioned to refer to illustrations of the creature.


3 Conclusions 


Some outcomes of this experiment provided informative answers to our original 
questions; others remain inconclusive. Positive conclusions are as follows. First, 
the processing bias for the branching ambiguity was different between adults and 
young children. At age four, the overall bias was toward the RB structure, which is 
considered marked and goes against the locality advantage in adult sentence pro- 
cessing. With the advantage of rapid online sentence comprehension put aside, the 
RB analysis is readily available and more favored by four-year-olds. 

Second, the interpretation of the prosodic information was also different in 
four-year-old children in that they allowed for only a single interpretation of the 
prosodic signal, unlike adults (Hirose 2020). The eye-tracking experiment demon- 
strated that the W2 rise increased looks to the RB target relative to the LB target, 
suggesting that RB syntax-marking prosodic information (e.g., metrical boost) can 
be appreciated by children at age four. With such knowledge, young children can
construct an RB hierarchical NP, which makes it less likely
that they arrived at the interpretation through a non-hierarchical analysis.

Third, the above-mentioned overall RB bias (enhanced by the W2 pitch rise) 
of the four-year-olds was not observed among the three-year-old population. In a 
future study, we intend to test these younger children with a simpler task (with a 
simpler visual scene) to at least eliminate the possibility that the participants were 
engaged in tasks other than linguistic processing.


Maria Polinsky, Hisao Kurokami, and Eric Potsdam


Chapter 14 
Exceptive constructions in Japanese 


1 Introduction 


Exceptives are constructions that express exclusion, as in (1). They typically com- 
prise an EXCEPTIVE PHRASE, which excludes the EXCEPTION from the domain of an 
ASSOCIATE. In (1), everyone is the associate, except Mary is the exceptive phrase, and 
Mary is the exception. An EXCEPTIVE MARKER usually introduces the exception. In 
English, this can include except, but, besides, and except for. 


(1) Everyone   laughed   [except/but/besides/except for   Mary]
    ASSOCIATE            EXCEPTIVE MARKER                 EXCEPTION
                         [.......... EXCEPTIVE PHRASE ..........]


Moltmann (1995), von Fintel (1993), Kleiber (2005), García Álvarez (2008), Gajewski
(2008, 2013), Crnič (2018), and Galal (2019) provide explicit semantic characteristics
of exceptive constructions, describing how they differ from restriction, addition, 
reservation, opposition, and concession. We follow them in identifying the range 
of constructions to investigate. It is also vital to separate constructions specifically 
dedicated to expressing exclusion from those that express exception as a corollary, 
particularly, focus constructions with only, as in (2), where the exceptive reading 
is an inference. 


(2) Only Mary laughed.


Beyond the cited references, the literature on exceptives is quite small, focusing 
largely on the construction's semantics, getting the right interpretation and infer- 
ences (Hoeksema 1987, 1995; Keenan and Stavi 1986; von Fintel 1993; Moltmann 
1995; Lappin 1996; Zuber 1998; Peters and Westerståhl 2006; Gajewski 2008; García
Álvarez 2008; Hirsch 2016). There is little syntactic work and no typological studies
(Reinhart 1991; Sava 2009; O'Neill 2011; Pérez-Jiménez and Moreno-Quibén 2012;
Soltan 2016; Potsdam and Polinsky 2017, 2019; Potsdam 2018a, 2018b, 2019;
Al-Bataineh 2021). In syntactic work, one can address the following questions: how are
exceptives expressed grammatically? Do some exceptives involve ellipsis of some 
kind to account for their interpretation? 

This chapter seeks to fill some of these gaps by examining syntactic proper- 
ties of the exceptive construction in Japanese, marked by the exponent igai, whose 
grammatical status we explore in section 4.1. While the main thrust of this chapter 
lies with the general description of Japanese exceptives, we hope for this discus- 
sion to stimulate experimental studies informed by our hypotheses; at several 
points in the chapter, we highlight possible experimental studies. In pursuing a 
syntactic description and analysis of Japanese exceptive constructions, we note 
the difference between connected and free exceptives, which are of interest to 
semanticists and syntacticians alike, and focus on the choice between the phrasal 
and clausal foundation of free exceptives. These issues inform the structure of the 
chapter. Section 2 introduces the difference between connected and free excep- 
tive constructions. Section 3 presents diagnostics designed to determine whether 
Japanese free exceptives are underlyingly phrasal or clausal. Section 4 discusses 
the derivation of free exceptives. Section 5 addresses several outstanding issues 
raised by the proposed analysis. Finally, section 6 briefly lists exceptive impostors: 
constructions that can convey the meaning of exclusion to a generalization as an 
inference, similar to the example in (2). 


2 Connected and free exceptives 


As with the English besides, which can introduce exceptions, igai has two core 
meanings: additive and subtractive/exceptive. An example of the additive meaning 
of igai is given below:¹


(3) 私は英語以外にロシア語を話せる。
    Watashi-wa  eigo-igai-ni       roshiago-o   hanas-e-ru.
    1SG-TOP     English-except-NI  Russian-ACC  speak-able-PRS
    “Besides English, I can speak Russian.”


The ambiguity between additive and exclusion readings of exceptive markers 
seems to be common cross-linguistically (Sevi 2008; Vostrikova 2019) and certainly 


1 Abbreviations follow the Leipzig Glossing Rules.

deserves a separate investigation, but we will not pursue it here. In what follows,
we will concentrate only on the exceptive function of igai. 

The consensus understanding of exceptives, based on the earliest semantic
work (Hoeksema 1987, 1995), recognizes a distinction between FREE and CONNECTED
exceptives, which refers to the surface position of the exceptive phrase
relative to the associate. In connected exceptives, the associate and the exceptive
phrase are adjacent and form a syntactic constituent, (4a).² In a free exceptive,
the reverse holds, (4b).


(4) a. 昨日はヒロ以外(の/*は)すべての男の子が来た。
       Kinoo-wa       [Hiro-igai(-no/*wa)  subete-no  otokonoko-ga]  ki-ta.
       yesterday-TOP  H-except-GEN-TOP     all-GEN    boy-NOM        come-PST
       “Yesterday, every boy except Hiro came.”
    b. ヒロ以外(は/*の)昨日はすべての男の子が来た。
       Hiro-igai(-wa/*no)  kinoo-wa       [subete-no  otokonoko-ga]  ki-ta.
       H-except-GEN-TOP    yesterday-TOP  all-GEN     boy-NOM        come-PST
       “Yesterday, every boy came, except Hiro.”


As the examples indicate, connected and free exceptives differ in their marking. 
Although both types are introduced by igai, the left-peripheral free exceptive 
phrase can be marked by the topic particle wa and cannot co-occur with the particle
no;³ for the connected exceptive (4a), only no is possible. Several properties
distinguish connected exceptives from free exceptives; Table 1 shows the main 
characteristics. 

As we consider Japanese exceptives marked by igai, at least two of the proper- 
ties in this table deserve special consideration. Regarding Property 2, Japanese does 
not line up as neatly as the more familiar English or Spanish where this property 
has been considered. By subtracting from the domain of a quantifier, connected 
exceptives are claimed to be subject to the Quantifier Constraint (QC) in (5) (Hoek- 
sema 1987, von Fintel 1993, Moltmann 1995), which restricts this quantifier to 


2 Brackets indicate what elements constitute the subject. 

3 Characterizations of no differ depending on its distribution and on the source. It is often
described as the genitive marker, which is how we represent it in the glosses. However, its functions
seem to be broader than that of the genitive. In our discussion, we refer to it as a particle. Nothing 
hinges on this characterization for the purposes of this study. 




being a universal or negative universal, (6). Free exceptives are not restricted by
the QC. The main clause need only be a generalization, which can admit exceptions,
as in (7).


Table 1: Differences between connected and free exceptives.

    Property                      Connected exceptive              Free exceptive
1   Semantics                     Subtracts from the domain of     Expresses an exception to
                                  a quantifier                     a generalization
2   Associate types               Certain quantified noun          XPs in general statements
                                  phrases only (universals)
3   Syntactic relation in clause  Nominal modifier                 Clausal modifier
4   Position in clause            Adjacent to associate            Clause-peripheral or in
                                                                   parenthetical position
5   Constituency                  Forms a constituent with the     Not a constituent with the
                                  associate                        associate
6   Category of exception         Nominal only                     Not restricted to nominals
7   Realization of associate      Must be syntactically realized   May be implicit


(5) Quantifier Constraint (Moltmann 1995: 227)
    The NP that an exceptive phrase [in a connected exceptive] associates with
    must denote a universal or negative universal quantifier.

(6) a. Every boy/All boys/No boy except John came.
    b. *Few boys/Most boys/Three boys/At least three boys/The boys/Boys except
       John came.

(7) a. Few know that Colorado produces wine, except visitors.
    b. The judges gave her a standing ovation, except Simon Cowell.


However, in Japanese, connected exceptives are possible with non-universal
quantifiers:


(8) 太郎以外の{ほとんど/たくさん/(少なくとも)三人}の男の子が来た。
    [Taroo-igai-no  {hotondo/takusan/(sukunakutomo) san-nin}-no  otokonoko-ga]
    T-except-GEN    most/many/at least three-CLF-GEN             boy-NOM
    ki-ta.
    come-PST
    “Most/(At least) three boys except Taro came.”



These examples indicate that the constraint on universal quantifiers in the associate
is too strong. This accords with the observations of García Álvarez (2008: 13-21)
and Galal (2019), who indicate that in English, apparent connected exceptives may
also violate the QC, as in (9). All these data indicate that more semantic exploration
of the nature of the QC generalization is needed.


(9) a. Salvias are native to most continents except Australia. 
b. There was little furniture except our big fridge in the corner of the living 
room. 
c. English policemen, except the guards who protect the royal family, do not 
carry guns. 
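
The domain-subtraction view of connected exceptives (Property 1 in Table 1) can be illustrated with a small set-based sketch. This is a deliberately simplified toy model, not the semantics proposed in the works cited above: it only subtracts the exception from the quantifier's domain and ignores further inferences (for instance, that the exception itself fails the predicate). The function names and the sample individuals are hypothetical.

```python
def every(domain, predicate):
    """Universal quantifier: the predicate holds of all members."""
    return all(predicate(x) for x in domain)


def no(domain, predicate):
    """Negative universal quantifier: the predicate holds of no member."""
    return not any(predicate(x) for x in domain)


def connected_exceptive(quantifier, domain, exceptions, predicate):
    """Evaluate 'Q domain except exceptions predicate' by removing the
    exceptions from the quantifier's domain before quantifying."""
    return quantifier(set(domain) - set(exceptions), predicate)


# Hypothetical model for "Every boy except John came."
boys = {"John", "Bill", "Tom"}
came = {"Bill", "Tom"}
plain = every(boys, lambda x: x in came)
with_exception = connected_exceptive(every, boys, {"John"}, lambda x: x in came)
```

In this toy model the unrestricted universal statement is false while the statement with the exception removed from the domain is true, which is the intuition behind treating a connected exceptive as a nominal modifier of a quantified associate.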


Property 7 is the other characteristic where Japanese exceptives differ from the
more familiar English ones. Assuming only free exceptives are clause-peripheral
(see Property 4), and excluding the ones with parenthetical intonation, all clause-internal
exceptives should be of the connected type, appearing with an explicit associate
because the exceptive phrase must have a syntactic constituent to modify. However,
this is not the case. In (10) and (11), there is no overt associate.⁴,⁵


(10) 太郎はりんご以外(を)食べた。
     Taroo-wa  ringo-igai(-o)    tabeta.
     T-TOP     apple-except-ACC  ate
     “Taro ate everything except the apple.”


(11) 納豆は日本で以外あまり見かけない。
     Nattoo-wa  nihon-de-igai    amari  mikake-nai.
     natto-TOP  Japan-in-except  much   see-NEG.PRS
     “Except Japan, we do not see natto much anywhere.”


We will return to these examples in section 5.3 after we have examined the differ- 
ence between clausal and phrasal exceptives, to which we now turn. 


4 It seems speakers vary on whether the accusative case marker o can be dropped in (10). For 
many of the Japanese speakers consulted, omitting o in sentences such as (10) does not seem to 
affect their grammaticality. 

5 It seems that speakers vary on whether having de before igai in (11) is acceptable. While some 
speakers note that the sequence de-igai is degraded, most of the Japanese speakers consulted found 
this word order to be well-formed.


3 Clausal and phrasal exceptives


While the free versus connected exceptive distinction is important, it is only part 
of the picture. In expanding the descriptive space for cross-linguistic investigation,
another parameter of variation is important: phrasal versus clausal
exceptives. This distinction has received far less attention in the literature because
it is primarily syntactic and not semantic. Initial appearances may suggest that an
exception such as Mary in Everyone left, except Mary is simply a noun phrase (NP);
however, work on Egyptian Arabic (Soltan 2016), English, Russian, Tahitian, Malagasy
(Potsdam 2018a, 2019; Potsdam and Polinsky 2017, 2019), and Spanish (Pérez-Jiménez
and Moreno-Quibén 2012) suggests that exceptions may contain a hidden
clausal structure reduced by ellipsis. In a PHRASAL EXCEPTIVE, the exception is a
direct phrasal complement to the exceptive marker, (12a). However, in a CLAUSAL 
EXCEPTIVE, the exception is part of a larger constituent that is clausal (12b). Material 
within this clause may have been deleted, giving the appearance of a smaller con- 
stituent (a suggestion first made in Harris 1982). 


(12) a. Nobody left, [except [Mary]NP ]           PHRASAL EXCEPTIVE
     b. Nobody left, [except [Mary <left>]CP ]    CLAUSAL EXCEPTIVE


Phrasal and clausal exceptives may co-occur in the same language and may be 
marked in formally distinct ways, as is the case in Russian (Oskolskaya 2014; 
Potsdam and Polinsky 2019). However, it is also possible that the surface realization 
of an exceptive construction may not be telling enough to determine its underlying 
syntax. Regarding free exceptives in Japanese, one could imagine two possible sce- 
narios, corresponding to (12a) and (12b) respectively. On the phrasal scenario, the 
exception is a simple nominal and the exceptive phrase is optionally marked by the 
topic particle wa.⁷


6 It is instructive to draw parallels between the exceptive and comparative constructions. In phrasal
comparatives, the complement of than is a single phrase, typically a determiner phrase (DP), whereas
in clausal comparatives, the complement of than is a clause (often with ellipsis). The ellipsis of clausal
material in a clausal comparative makes it indistinguishable from the phrasal one on the surface,
and special diagnostics are needed to tell them apart (cf. Bresnan 1973; Bhatt & Takahashi 2011).

(i) a. John is older [than [Mary]DP ]       PHRASAL COMPARATIVE
    b. John is older [than [Mary is]CP ]    CLAUSAL COMPARATIVE

7 The hypothesis remains neutral on whether the exceptive phrase originates inside the quantified
associate and moves to the clause-initial position or whether it is base-generated in the initial
position.




(13) phrasal analysis of Japanese free exceptives
     Mearii-igai(-wa)  paati-ni  minna(-ga)  ki-ta.
     Mary-except-TOP   party-to  all-NOM     come-PST
     “Except Mary, everyone came to the party.”


In the clausal scenario, the associate and the expression of exception do not form a
constituent at any level of representation. They start in separate clauses, and some
of the identical material is deleted under ellipsis:⁸


(14) clausal analysis of Japanese free exceptives
     [[Mearii-ga  paati-ni  ki-ta]    igai](-wa)  minna(-ga)  paati-ni
     Mary-NOM     party-to  come-PST  except-TOP  all-NOM     party-to
     ki-ta.
     come-PST
     “Except Mary, everyone came to the party.”


In either derivation, the surface form of the free exceptive is the same. Diagnostics 
distinguishing phrasal and clausal exceptives are needed to decide between these 
two approaches. We summarize the core ones in Table 2. The list presented here is 
not exhaustive but sufficient to identify the category of the constituent introduced 
by igai and will allow us to compare Japanese with other languages whose excep- 
tives have been studied. It also allows for concentrating on some diagnostics that 
are less clear-cut or have not been studied extensively, in particular, D3 and D7. 


Table 2: Diagnostics differentiating between phrasal and clausal exceptives.

                                              PHRASAL EXCEPTIVE   CLAUSAL EXCEPTIVE
1   Exception can be a full clause            no                  yes
2   Multiple exceptions                       no                  yes
3   Fixed form of nominal exception           yes                 no
4   Clausal/speaker-oriented adverbs          no                  yes
5   Separate binding domains                  no                  yes
6   Ambiguity in sluicing                     no                  yes
7   Internal reading with “same, different”   yes                 no


8 In such cases, a particular issue has to do with the change in polarity between the two clauses,
which is necessary for identity of the elided material and the material in the antecedent. We will
return to this issue in section 5.2.

Diagnostic 1: The most straightforward diagnostic is that clausal exceptives allow
full expression of the missing clausal material, as in (15), while this is impossible in 
phrasal exceptives. 


(15) They did not invite anyone, except they invited Mary. 


In Japanese free exceptives, an entire clause containing the exception can be
expressed:


(16) メアリーを招待した以外は彼らは女の子を招待しなかった。
     Mearii-o  shoutaishi-ta-igai-wa  karera-wa  onnanoko-o
     Mary-ACC  invite-PST-except-TOP  they-TOP   girl-ACC
     shoutaishi-nakat-ta.
     invite-NEG-PST
     “They did not invite any girls, except they invited Mary.”


(17) 太郎が英語を話せる以外は誰も外国語を話せません。
     Taroo-ga  eigo-o       hanas-e-ru-igai-wa
     Taro-NOM  English-ACC  speak-can-PRS-except-TOP
     daremo  gaikokugo-o           hanas-e-mas-en.
     nobody  foreign.language-ACC  speak-can-POLITE-NEG
     “No one speaks a foreign language, except that Taro speaks English.”


Such data point to a clausal analysis of Japanese free exceptives. 


Diagnostic 2: Clausal exceptives allow for multiple exceptions, which do not form 
a single constituent, while phrasal exceptives do not. We discuss the mechanism 
by which exceptions might escape the clausal ellipsis below; however, the contrast 
follows from the reasonable assumption that this mechanism is iterative, while the 
exceptive marker in phrasal exceptives cannot select multiple complements. 


(18) Every boy danced with every girl, except [John] [with Mary]. 


Multiple exceptions are grammatical although dispreferred in Japanese free excep- 
tives. We hypothesize that this dispreference may stem from processing factors; 
because of the rigidly head-final nature of Japanese, the free exceptive must 
precede the clause stating the generalization, and holding several exceptions that 
must be linked to associates in working memory may cause discomfort. Further 
research can test whether this explanation is correct.

(19) ジョンを田中先生に以外(は)昨日はすべての学生をすべての先生に紹介できた。
     [Jyon-o]  [Tanaka-sensei-ni]-igai(-wa)   kinoo-wa
     John-ACC  Tanaka-teacher-DAT-except-TOP  yesterday-TOP
     [subete-no  gakusei]-o   [subete-no  sensei]-ni   syookai-deki-ta.
     all-GEN     student-ACC  all-GEN     teacher-DAT  introduce-able-PST
     “Yesterday, (I) was able to introduce every student to every teacher, except John
     to Tanaka-sensei.”


Additionally, an anonymous reviewer notes a contrast in grammaticality when 
different case markers are used in free exceptives. As shown here, pronouncing 
accusative and dative case markers on the respective NPs does not affect the gram- 
maticality of a sentence. However, the use of the nominative marker is marginal 
at best. For example, (14) is heavily degraded if Mary appears with a nominative 
case marker. We hypothesize that it has to do with the difference in the information-structure
import of ga vs. wa. In root clauses, the former is used to mark backgrounded
information and is commonly found in thetic clauses (Kuroda 1972).
Such encoding is incompatible with the contrastive interpretation expected of an 
exception. Further, the structure we propose in (42b) below involves topicalization 
of the exception, which calls for wa, not ga. 


Diagnostic 3: The exception in a clausal exceptive can be non-nominal, while that 
in a phrasal exceptive must be nominal. The possibility of a non-nominal exception 
follows if the mechanism that allows the exception to avoid ellipsis is insensitive 
to the category of the exception. However, with phrasal exceptives, the exceptive 
marker selects only nominal complements. This contrast obtains in Japanese. In 
Japanese connected exceptives, which we believe are phrasal, the exception is 
always nominal and is incompatible with a postposition, (20). In free exceptives, 
a postposition is possible, preceding or following igai [we set aside interpretive 
differences between examples such as (21a) and (21b)].⁹


(20) 納豆は日本(*で)以外(で)の国であまり見かけない。
     Nattoo-wa  nihon-(*de-)igai(-de)-no  kuni-de     amari  mikake-nai.
     natto-TOP  Japan-in-except-in-GEN    country-in  much   see-NEG.PRS
     “We don’t see natto much in countries other than Japan.”


9 See section 4 for structural differences between the two orders of postposition and exceptive 
marker.

(21) a. 日本以外(は)納豆はどの国でもあまり見かけない。
        Nihon-igai(-wa)     nattoo-wa  donokunidemo  amari  mikake-nai.
        Japan-except(-TOP)  natto-TOP  any.country   much   see-NEG.PRS
     b. 日本で以外(は)納豆はどの国でもあまり見かけない。
        Nihon-de-igai(-wa)     nattoo-wa  donokunidemo  amari  mikake-nai.
        Japan-in-except(-TOP)  natto-TOP  any.country   much   see-NEG.PRS
     c. 日本以外で?(は)納豆はどの国でもあまり見かけない。
        Nihon-igai-de-?(wa)    nattoo-wa  donokunidemo  amari
        Japan-except-in(-TOP)  natto-TOP  any.country   much
        mikake-nai.
        see-NEG.PRS
        “Except Japan, we don’t see natto much in any country.”


Diagnostic 4:¹⁰ Clausal exceptives allow for a clause-level adverb in the exception,
as in (22), while phrasal exceptives do not, as in (23).¹¹ The basis for this diagnostic
is the assumption that temporal adverbs and speaker-oriented adverbs require a 
clause to modify and cannot modify nominals. 


(22) a. I was able to meet everyone, except regrettably/unfortunately/sadly Mary.
     b. I will go to any party, except yours tomorrow.
     c. The workers always eat here, except Juan on Mondays.


(23) a. *Everyone except regrettably Mary came to the party.
     b. *No party except yours on Tuesday was attended by the mayor.

In Japanese, the contrast between connected and free exceptives is observed with 
modal and speaker-oriented adverbs. Consider the following pair: 


10 This diagnostic is developed and applied in Pérez-Jiménez and Moreno-Quibén (2012), Soltan 
(2016), and Vostrikova (2021). 

11 Examples such as (23) must be read without parenthetical intonation that would allow for a 
clausal structure.

(24) a. 花子以外のすべての女の子が知っている限り/たぶんパーティーに来ます。
        Hanako-igai-no  subete-no  onnanoko-ga
        H-except-GEN    all-GEN    girl-NOM
        sitteirukagiri/tabun           paati-ni  ki-mas-u.
        based.on.my.knowledge/perhaps  party-to  come-POLITE-PRS
        “Based on my knowledge/Possibly, all girls except Hanako will come to
        the party.”
        NOT: “Except, based on my knowledge/possibly, Hanako, all girls will
        come to the party.”

     b. 花子以外は知っている限り/たぶんパーティーにすべての女の子が来ます。
        Hanako-igai-wa  sitteirukagiri/tabun           paati-ni
        H-except-TOP    based.on.my.knowledge/perhaps  party-to
        subete-no  onnanoko-ga  ki-mas-u.
        all-GEN    girl-NOM     come-POLITE-PRS
        “Based on my knowledge/Possibly, all girls except Hanako will come to
        the party.”
        % “Except, based on my knowledge/possibly, Hanako, all girls will come
        to the party.”¹²


In (24a), a connected exceptive, the adverbials “based on my knowledge” and 
“perhaps, possibly” necessarily scope over the entire clause. In (24b), the scope of 
the adverbial is ambiguous; it can be interpreted as scoping over the entire clause 
or just over the negative entailment that Hanako will not come. This latter interpretation
suggests that the adverb is contained only in the exceptive clause (whose remaining
material is deleted) and is not associated with the main clause (the elided material is
indicated with < >):


(25) [Hanako-igai-wa sitteirukagiri/tabun <...>] paati-ni subete-no
H-except-TOP based.on.my.knowledge/perhaps party-to all-GEN
onnanoko-ga ki-mas-u.
girl-NOM come-POLITE-PRS
“Except, based on my knowledge/possibly, Hanako, all girls will come to the
party.”


12 Not all the speakers we consulted get the reading where the tense phrase (TP) adverbial
scopes just over the exception. Further work is needed to understand what may cause
cross-speaker variation.
The two canonical positions of clausal adverbs are right before and after the subject
(Koizumi and Tamaoka 2010). Assuming such positions, the two readings of the 
example with a free exceptive result from a structural ambiguity in which there are 
two clauses: the adverb may be interpreted either within the exceptive clause or 
the main clause expressing the generalization (all the girls will come to the party). 
The two placements should be distinguishable by prosodic contours, an issue we 
leave for further research. Crucial for our purposes is the fact that the connected 
exceptive does not show ambiguity in the scope of clausal adverbials because there 
is only a single clause. 


Diagnostic 5: Assuming a free exceptive is clausal, each of the linked clauses con- 
stitutes its local binding domain. In that case, binding can be found in one of the 
clauses but not in both, as in the following English example; the corresponding 
connected exceptive is ungrammatical because multiple exceptives are impossible 
(see D2). 


(26) a. Nobody made any gains for anyone, except John for himself.    CLAUSAL
b. *Nobody except John for himself made any gains for anyone.    PHRASAL


Japanese free exceptives also show separate binding domains: 


(27) 花子が自分のこと以外は誰も何も心配していない。
[Hanako-ga zibun-no-koto-igai-wa] [daremo nanimo
[H-NOM self-GEN-thing-except-TOP] [nobody anything
sinpai-shite-i-nai].
worry-do-PRS-NEG.PRS]
“Except for Hanako about herself, nobody is worried about anything else.”


Diagnostic 6: A diagnostic based on sluicing is developed by Stockwell and Wong
(2020) (initially noted in Merchant 2001: 22). The authors observe that examples
such as (28) are ambiguous. In (28a), the content of the missing material is supplied by
the entire first clause, including the exceptive phrase, serving as the antecedent.
The interpretation in (28b) is mysterious, as the required antecedent John liked the
movie is absent. Stockwell and Wong (2020) argue that this interpretation is available
because the exceptive contains a hidden clausal structure, as in (29), which
supplies the needed antecedent.


(28) Nobody liked the movie, except John, but I don't know why.    CLAUSAL
a. but I don't know why <nobody liked the movie except John>.
b. but I don't know why <John liked the movie>.

(29) Nobody liked the movie, except John <liked the movie>, but I don't know why.


Phrasal exceptives in English do not allow for the second reading, as the antecedent 
needed for reading (30b) is simply not available. 


(30) Nobody except John liked the movie, but I don't know why.    PHRASAL
a. but I don't know why <nobody except John liked the movie>.
b. *but I don't know why <John didn't like the movie>.


The situation in Japanese is more nuanced. Consider the following example with a 
free exceptive: 


(31) 太郎以外は会議でみんな怒ってたけど、なぜか(は)分からない。
Taroo-igai-wa kaigi-de minna okot-te ta-kedo,
T-except-TOP meeting-at all get.upset-GER PST-CONJ
nazeka(-wa) wakar-anai.
why(-TOP) understand-NEG.PRS
“Except Taro, everyone was upset during the meeting, but I don't understand
why.”


Assuming the underlying clausal structure in a free exceptive, we should expect 
two readings: (i) the speaker does not know why everyone except Taro was upset, 
and (ii) the speaker does not know why Taro was not upset. However, most Jap- 
anese speakers we consulted only accept reading (i). It is not entirely clear why 
reading (ii) is not available, and examples such as (31) add a new dimension to the 
investigation of sluicing and related phenomena in Japanese. 

At this point, we would like to offer a couple of considerations. First, it is pos- 
sible that reading (ii) is blocked because of the nature of the deletion in the sluiced 
clause. Thus, to anticipate our discussion in section 4, the exceptive clause is nom- 
inalized, which may preclude the necessary identity required to license ellipsis in 
sluicing. That alone does not constitute an explanation but adds more complexity 
to the already murky issue of clausal ellipsis in Japanese (Merchant 2001: 84-85; 
Yoshida, Nakao, and Ortega-Santos 2014). It is not clear if nominalized clauses can 
antecede sluicing in Japanese (Masaya Yoshida, p.c.). Second, another possible 
explanation has to do with the insufficient context supplied by the construction in 
(31), something that could be ascertained in an experimental study; however, the 
question still arises as to how exactly English and Japanese free exceptives differ 
per the sluicing diagnostic. 
Diagnostic 7: The richness of context noted for D6 also plays a significant role
in the last diagnostic: ambiguity of interpretation with the words different or
same (based on Beck 2000). These words can have discourse-anaphoric and reciprocal-like
readings, as illustrated in (32). We term these the external and internal
readings (Beck 2000 calls them discourse-anaphoric and Q-bound readings).


(32) Every student reads a different book. 
a. Every student reads a book that is different from a salient book in the
discourse.    EXTERNAL READING
b. Every student reads a book that is different from the one that any other
student reads.    INTERNAL READING


This ambiguity can serve as a diagnostic for clausal exceptives. Phrasal, but not clausal,
exceptives allow for the internal reading:


(33) a. Every student reads a different book.    AMBIGUOUS
b. Every student reads a different book, except Mary.    EXTERNAL READING ONLY
c. Every student except Mary reads a different book.    AMBIGUOUS


The reason that the internal reading is not available in the clausal exceptive can
be seen by looking at the non-elliptical version in (34). The exceptive clause Mary
doesn't read a different book has only an external reading, as there is no quantifier to
trigger the Q-bound reading.


(34) Every student reads a different book, except Mary doesn't read a different 
book. 


If this contrast is genuine, then it provides us with a way to probe the internal 
structure of exceptives in languages that allow for similar ambiguity for the words 
different or same. In Japanese, the word 違う tigau “different” allows for the same
ambiguity. 


(35) すべての学生が違う本を読んだ。
Subete-no gakusei-ga tigau hon-o yon-da.
all-GEN student-NOM different book-ACC read-PST
“Every student reads a different book.”
a. Every student reads a book that is different from the salient one in the
discourse.    EXTERNAL READING
b. Every student reads a book that is different from the one any other
student reads.    INTERNAL READING


In applying the diagnostic to Japanese exceptives, we find no contrast between con- 
nected and free exceptives: 


(36) a. 太郎以外のすべての学生が違う本を読んだ。
Taroo-igai-no subete-no gakusei-ga tigau hon-o
T-except-GEN all-GEN student-NOM different book-ACC
yon-da.
read-PST
“Every student except Taro reads a different book.”
b. 太郎以外はすべての学生が違う本を読んだ。
Taroo-igai-wa subete-no gakusei-ga tigau hon-o
T-except-TOP all-GEN student-NOM different book-ACC
yon-da.
read-PST
“Except Taro, every student reads a different book.”


Although the two readings seem clear, native speakers of English and Japanese
vary in discerning them, even with sufficient context provided. A cursory survey
of several English and Japanese speakers suggests that some do not accept the internal
reading at all. Regarding Japanese, several speakers found (36a) and (36b) alike
in that both allow only the external reading. Some speakers of both languages
accepted the internal reading for both free and connected exceptives, including
in contexts where the external reading was contextually ruled out. This result
calls for closer scrutiny of the diagnostic and may invite future experimental
work on separating the external and internal readings in exceptives.

We have identified several clear differences between free and connected 
exceptives in Japanese, which suggest that the former are clausal in nature. We 
have also identified areas of diagnostic uncertainty, which may highlight the weak- 
ness of certain diagnostics or the need for further study, including experimental 
investigations. Assuming Japanese free exceptives are clausal, the next question 
regards the way they are derived. We turn to this issue in the next section. 
4 Derivation of Japanese free exceptives


Section 3 argued that free exceptives in Japanese have clausal origins followed by
ellipsis, as sketched in (12b). To recapitulate, evidence in favor of this analysis
comes from the availability of a full clause in free exceptives; multiple exceptions 
that do not form a constituent; non-nominal exceptions; separate binding domains; 
and the availability of clausal adverbs scoping exclusively over the exception. In 
this section, we explore the details of the Japanese derivation and compare it to the 
clausal analysis of English free exceptives (Potsdam and Polinsky 2019). We begin
by discussing the categorial status of the exceptive marker igai.


4.1 Categorial status of igai 


以外 igai “outside,” along with the similarly structured 以内 “inside,” was borrowed
from Chinese, possibly in the Han period. Both words are built on the
verb 以 (cf. Djamouri, Paul, and Whitman 2013). Martin (1975: 113) describes igai
rather cryptically as a restrictive particle. Categorially, it could be a conjunction, 
a (relational) noun, or a postposition. The inventory of conjunctions in Japanese 
is quite slim, and, in any case, they do not co-occur with wa, which rules out that 
characterization. 

We already brought up parallels between exceptive and comparative construc- 
tions; the comparative marker in Japanese is characterized as a relational noun 
(Sudo 2015), which raises the possibility that igai is similarly a noun. However, igai 
cannot occur on its own, which is unexpected of nouns:¹³


13 A reviewer notes that there is one context in which igai can occur alone, namely an “echo”
context, as in (i):

(i) A: ええと、太郎以外は…
Eeto Taroo-igai-wa...
well T-except-TOP
“Well, except Taro . . .”

B: 以外は？
Igai-wa?
except-TOP
“Except what?”

In all other occurrences, igai must be accompanied by a complement that denotes
an exception.
(37) a. *以外は？
*Igai-wa?
except-TOP
(“What about others?”)
b. 他は？
Hoka-wa?
other-TOP
“What about others?”


Further, igai can combine with NPs, such as koto “thing,” without any linking 
material, as is typical of Japanese postpositions (e.g., Kuno 1973: 213-220): 


(38) 太郎が来ること以外は聞いてない。
Taroo-ga kuru-koto-igai-wa kii-te-nai.
T-NOM come-koto-except-TOP hear-GER-NEG.PRS
“I was not informed about anything except that Taro is coming.”


Stacking is another characteristic typical of Japanese postpositions (Kuno 1973: 108- 
111; Shibatani 1977; Sadakane and Koizumi 1995), and igai can stack with other post- 
positions, as in example (21c), where it co-occurs with de. These considerations point 
to the status of igai as a postposition. Thus, it should combine with an NP, though we 
have already presented evidence that Japanese free exceptives contain a clausal layer. 
These findings can be reconciled by positing a nominal layer above the clausal layer. 


4.2 Evidence for the nominal layer in free exceptives 


A nominal layer above the clausal one is not unique to the exceptive constructions 
in Japanese; it has been proposed for comparatives (Sudo 2015 and references 
therein) and all kinds of temporal and conditional clauses (Kuno 1973; Tsujimura 
1992; Horie 1997). The initial evidence in favor of the external nominal layer above 
the clausal structure stems from examples such as (38), where the overt nominal 
koto appears. Additional evidence in favor of the nominal layer stems from the 
use of adnominal inflection in exceptives. Some predicates take different forms in
finite (copular) and adnominal positions (cf. Miyagawa 1987), for example:


(39) a. デザインがとても簡素{だ/*な}。
Dezain-ga totemo kanso{-da/*-na}.
design-NOM very simple-COP/ADN
“The design is very simple.”
b. とても簡素{*だ/な}デザインが
totemo kanso{*-da/-na} dezain-ga
very simple-COP/ADN design-NOM
“a very simple design”


In exceptive constructions, only the adnominal form can be used, which indicates 
that an NP precedes igai even when it is not expressed overtly: 


(40) デザインがとても簡素{*だ/な}以外は文句のつけどころがない。
Dezain-ga totemo kanso{*-da/-na}-igai-wa
design-NOM very simple-COP/ADN-except-TOP
monku.no.tuke.dokoro-ga nai.
place.to.complain.about-NOM NEG.PRS
“Except for the design being very simple, there is nothing to complain about.”


If this is on the right track, we can characterize igai uniformly as a postposition that 
combines with an NP. The head of that NP may (but does not have to) be spelled out 
(see Tsujimura 1992; Horie 1997 on the optionality of final heads in Japanese nomi- 
nalizations). In free exceptives, such an NP includes a nominalized complementizer 
phrase (CP), thus: [PP [NP [CP ...] (koto)] igai].

A possible consideration against this proposal comes from the lack of the nom- 
inative-genitive conversion (NGC), also known as ga-no conversion: a phenome- 
non where the nominative and genitive of a subject can alternate in a prenominal 
clause (Harada 1971; Hiraiwa 2001; Maki and Uchibori 2008; Ochi 2017). Commonly 
observed in relative clauses, NGC is not available in exceptives: 


(41) 太郎{が/*の}その本を読んだ以外(は)誰も何も読まなかった。
[Taroo{-ga/*-no} sono hon-o yon-da]-igai(-wa)
T-NOM/GEN that book-ACC read-PST-except-TOP
daremo nanimo yom-anakkat-ta.
anyone anything read-NEG-PST
“Except for Taro reading that book, no one read anything.”


However, it has been argued on independent grounds that, first, relative clauses
are TPs, not CPs (Murasugi 1991; but see Kaplan and Whitman 1995 for the CP
analysis of Japanese relative clauses), and, second, NGC is available only in TPs
(Hale 2002; Miyagawa 2013). On the assumption that exceptive clauses are CPs, we
do not expect to find NGC in them. (An alternative may appeal to the fact that the
exception in (41) is a clause, so the whole clause has been fronted to the exception
position, presumably spec,CP. In that position, the clausal subject is inaccessible
for conversion, which requires access to the subject from outside the CP.)


4.3 Analytical details 


Free exceptives in Japanese are derived via the attachment of the postpositional 
phrase headed by igai to a clause that expresses the generalization. To illustrate, 
we present the derivation for the following sentence, similar to (4b) above; in the 
schematics below, we use English glosses only. 


(42) a. ヒロ以外(は)すべての男の子が来た。
Hiro-igai(-wa) subete-no otokonoko-ga ki-ta.
H-except-TOP all-GEN boy-NOM come-PST
“Except Hiro, every boy came.”
b. [TP [PP [NP [CP [NP2 Hiro]
                   [C' <[TP_E t2 [T' [VP come] [T PAST]]]> C]]
               [NP {null}]]
           [P igai]]
       [TP [NP1 every boy]
           [TP_A t1 [T' [VP come] [T PAST]]]]]


The antecedent clause in (42), every boy came, is TP_A, and the associate of the exception
undergoes quantifier raising (although it is not clear whether it is a crucial part
of an exceptive derivation). The exceptive phrase is a postpositional phrase (PP)
adjoined to TP_A, where the postposition igai selects an NP (with the null noun head
in this case). This NP includes a CP, where the exception, Hiro, has moved to spec,CP,
and the remainder (TP_E) undergoes deletion under identity with the antecedent
clause TP_A. The exceptive PP can also appear in a topic phrase (not shown in the
derivation). As multiple topics are allowed in Japanese, free exceptives and clausal
adverbials can appear in alternate orders:
(43) a. ヒロ以外は昨日はすべての男の子が来た。
Hiro-igai-wa kinoo-wa subete-no otokonoko-ga ki-ta.
H-except-TOP yesterday-TOP all-GEN boy-NOM come-PST
b. 昨日はヒロ以外はすべての男の子が来た。
Kinoo-wa Hiro-igai-wa subete-no otokonoko-ga ki-ta.
yesterday-TOP H-except-TOP all-GEN boy-NOM come-PST
“Except Hiro, yesterday every boy came.”


Positional alternations between free exceptives and other clause-peripheral mate- 
rial suggest that the occurrence in the first position of the left periphery is not a crit- 
ical property of Japanese free exceptives. Consider now the derivation of a clausal 
free exceptive in English (Potsdam and Polinsky 2019):¹⁴


(44) a. Every boy came, except Hiro. 


b. [&P [TP [NP1 every boy]
           [TP_A t1 [T' [T PAST] [VP come]]]]
       [&' [& except_NEG]
           [CP [NP2 Hiro]
               [C' C <[TP_E t2 [T' [T PAST] [VP come]]]>]]]]


In English, except is a coordinating conjunction that heads an &P, coordinating the
main clause Every boy came and the exceptive clause, except Hiro. The antecedent
clause Every boy came is TP_A, and the associate of the exception undergoes quantifier
raising (although it is not clear whether it is a crucial part of an exceptive derivation).
The exceptive phrase comprises the exceptive marker and a clause, TP_E,
out of which the exception has moved. For concreteness, we show the exception
moving to spec,CP. Finally, the exceptive clause, TP_E, is deleted under identity with
the antecedent clause, TP_A.


14 We represent the exceptive conjunction as including covert negation, allowing for the identity
of polarity in the antecedent and elided clauses. Section 5.2 discusses issues of polarity in depth.
If we now compare the derivation of Japanese free exceptives to that of English
ones, headedness aside, the main differences lie in the nature of the exceptive
marker (a postposition in Japanese, a coordinating conjunction in English) and the
presence of the nominal layer above the exceptive clause CP (present in Japanese, absent
in English). A reason for the difference between the two languages may lie in the
impoverished inventory of Japanese conjunctions; in their absence, other means of
clause linking can be used.


5 Outstanding issues 


Assuming a PF deletion analysis in the derivation of free exceptives in Japanese, 
as in (42b), we face several outstanding issues, such as (i) the nature of the comple- 
mentizer in the CP embedded under igai, and (ii) issues of identity under ellipsis. 
We discuss them in sections 5.1 and 5.2. Other outstanding issues that arise outside 
of the ellipsis analysis have to do with silent associates in connected exceptives and 
the relation between exceptives and negation. 


5.1 Nature of the head in the embedded complementizer phrase


We analyze the clause embedded under the nominalizing head in the igai-postposi- 
tional phrase as a CP for two reasons, both of them indirect. First, the exception, the 
remnant that survives clausal ellipsis, is arguably A-bar moved and contrastively 
focused. Such material appears in the CP area (e.g., Rizzi 1997; Erteschik-Shir 2007). 
However, the A-bar movement proposal is particularly hard to defend given the 
lack of clear island effects in Japanese (Fukui 2006; Lasnik and Saito 1992; Omaki 
et al. 2020; Richards 2000; Watanabe 2003), let alone the lack of overt wh-movement. 
Second, we contrasted Japanese exceptive clauses with relative clauses; the 
latter are, arguably, TPs in Japanese and allow for NGC. By that logic, the former
are larger in structure, hence CPs. It would be desirable to identify other evi- 
dence in favor of the CP analysis. It is also important to understand the nature 
of the silent complementizer C that is present in the exceptive clause. This head 
attracts the expression of exception to its specifier. Following Lobeck (1995) and
Merchant (2001), we assume this head carries the feature [E], which licenses the 
non-pronunciation of its complement. Given that exceptions are not wh-words, the 
nature of the C head is unclear and remains an issue for future investigation. 
A silent C has also been proposed in some clausal analyses of Japanese comparatives
(Bhatt and Takahashi 2011; see Sudo 2015 for the proposal that these clauses
include an underlying relative clause only). It remains to be seen if the underlying
C in these clauses, which then undergo ellipsis, is the same or different in nature.


5.2 Identity under ellipsis 


Since the earliest studies on ellipsis, a recurring issue has been the form of the 
identity requirement that must hold between an elided element and its antecedent 
(see Lipták 2015 and Ranero 2021 for a summary and references). Early analy- 
ses (Chomsky 1964, 1965; Ross 1967; Sag 1976; Williams 1977) required strict syn- 
tactic identity, while later ones turned to a purely semantic identity requirement 
(Dalrymple, Shieber, and Pereira 1991; Hardt 1993, 1999; Merchant 2001). Recent 
work has returned to a purely syntactic account or a mixed account in which both 
semantic and some amount of syntactic identity are required (Chung, Ladusaw, and
McCloskey 2011; Merchant 2013; Lipták 2015; Barros and Vicente 2016; Thoms 
2015; Ranero 2021). 

In exceptives, the issue of identity arises regarding polarity mismatch. Exceptives
require that the elided clause and the antecedent have opposite polarity, as in
(45). This can be seen in the interpretation of the exceptives in (46), where the polarities
of the overt and elided clauses are opposite.


(45) Polarity Generalization (following García Álvarez 2008: 129) 
The proposition expressed in the main clause and exceptive clause must have 
opposite polarity. 


(46) a. Every student succeeded, except Bill <didn’t succeed>.
b. I didn't see anyone, except Bill <I saw>.


Three possible solutions emerge, and we will sketch them out briefly. Assuming 
syntactic identity on ellipsis, the polarity reversal may be only apparent, and the 
exceptive phrase contains a possibly covert instance of negation that triggers the 
reversal, for example, embedded in the meaning of the exceptive marker (Potsdam 
2019; Soltan 2016). In some languages, such as Malagasy, the negative component 
of the exceptive marker is morphologically overt (Potsdam 2019). In this approach,
the negation is not actually inside the ellipsis site and there is no polarity mismatch. 
If so, (47a) is analyzed along the lines of (47b); we represented such negation in the 
structure of the English example in (44b). 
(47) a. Every student succeeded, except Bill.
b. Every student succeeded, AND.NOT Bill <succeeded>.


Extending this idea to Japanese, the lexical specification of igai includes negation,
making it similar to a caritive postposition (“without”). A possible consideration
against this approach has to do with the non-polarity-reversing (additive) meaning
of igai, as illustrated in (3); that is, igai has two different meanings. It is still possible
to imagine two different lexical items, one with negation in it (“apart from; not included in”)
and the other without it (the additive marker), but it is striking that such co-occurrence
of meanings is cross-linguistically common, hence non-accidental (Zuber
1998; Sevi 2008; Vostrikova 2019).

Another way of tackling polarity mismatches while maintaining syntactic iden- 
tity relies on featural (under)specification (Ranero 2021). The main constraint on 
identity is realized via the presence (absence) of features. However, instead of a 
simple featural identity, the syntactic condition on the ellipsis relies on features 
being non-distinct. For example, a privative feature present in the antecedent but 
not in the ellipsis site (or vice versa) does not constitute a violation of identity. Nor
is a functional projection present in one but not in the other. 

In this approach, clauses containing negation project a ΣP phrase where the
head Σ hosts a [NEG] feature (Laka 1990, 1991). Conversely, ΣP is absent in affirmative
clauses (Laka 1990, 1991). Adopting this analysis, exceptives involve a mismatch
between the absence and presence of a head bearing a feature bundle; in
this case, Σ[NEG]. The affirmative clause is featurally empty regarding Σ[NEG]; hence,
no feature clash is observed, and ellipsis is possible (modified from Ranero
2021: 188):


(48) Antecedent:    [XP ... YP]          no Σ⁰
     Ellipsis site: [ΣP [XP ... YP]]     Σ⁰ [+NEG]
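
The non-distinctness condition can be made concrete with a small computational sketch. The Python fragment below is only an illustration under assumptions introduced here, not part of Ranero's (2021) formalization: clauses are reduced to hypothetical dictionaries that map a functional head to the feature it bears, and the name non_distinct and the tense contrast are invented for exposition. Two clauses count as non-distinct when every head they share bears the same feature; a head projected on only one side (here Σ with [NEG]) is ignored.

```python
from typing import Dict

# Hypothetical encoding: each clause maps a functional head to its feature.
# A head that is not projected is simply missing from the dictionary.
FeatureBundle = Dict[str, str]


def non_distinct(antecedent: FeatureBundle, ellipsis_site: FeatureBundle) -> bool:
    """Return True if no head shared by both clauses bears clashing features."""
    shared_heads = antecedent.keys() & ellipsis_site.keys()
    return all(antecedent[head] == ellipsis_site[head] for head in shared_heads)


# Affirmative antecedent: no Sigma head is projected at all.
antecedent = {"T": "PAST"}
# Negative ellipsis site: Sigma is projected and bears [NEG].
ellipsis_site = {"T": "PAST", "Sigma": "NEG"}

# Sigma is absent from the antecedent, so nothing clashes.
polarity_mismatch_ok = non_distinct(antecedent, ellipsis_site)

# Toy contrast (an assumption of this sketch): one head, two different features.
tense_mismatch_ok = non_distinct({"T": "PAST"}, {"T": "PRS"})
```

Under this toy encoding, the polarity mismatch of exceptives passes the check, whereas the invented tense contrast does not; the latter is included only to show what a featural clash would look like.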


Finally, another strand of explanation for the Polarity Generalization is that such 
mismatches are generally allowed in clausal ellipsis, and syntactic conditions on 
ellipsis are just too restrictive. Kroll (2019) documents several sluicing contexts 
in which the sluiced clause and its antecedent mismatch in polarity. In (49), the 
antecedent is positive, while the sluiced clause is negative. 


(49) Either the Board grants the license by December 15 or it explains why <it didn’t
grant the license by December 15>. (Kroll 2019: 25)


Kroll (2019, 2020) offers a discourse-pragmatic analysis of the identity condition
in clausal ellipsis that allows for such mismatches. However, it remains to be
seen how to save this approach from overgeneration, whereby more mismatches
may be allowed than are actually possible. Identity conditions on deletion in clausal
exceptives may not be uniform for all exceptive clauses. For instance, the (covert) 
negation approach may work for exceptive markers that do not have the additive 
reading, and the featural non-distinctness may be more applicable to structures 
with markers such as the Japanese igai or English besides. We leave the choice of a 
specific approach to identity for further research. 


5.3 Missing associate 


In Section 2, we already introduced a possible challenge concerning the contrast 
between connected and free exceptives regarding the implicit nature of the asso- 
ciate. Based on English, several researchers propose that the associate can only 
be implicit in free exceptives (presumably regardless of their phrasal or clausal 
derivation). 

The situation in Japanese is more complicated. First, only the left periphery is 
available for exceptive placement, and as discussed in Section 4.3, optional scram- 
bling of free exceptives is also possible. Thus, this diagnostic in and of itself is not 
very strong. Second, case markers, the topic marker wa, and the linker no can be 
dropped under several conditions (Kuno 1973; Fry 2003; Fujii and Ono 2000). Hence, 
the status of the exception expression is not always clear. It is further confounded 
by some graded judgments we will review below. We start by reviewing some of the 
examples with an unexpressed associate. 


(50) そのデザートは太郎以外が食べる。
Sono dezaato-wa Taroo-igai-ga taberu.
this dessert-TOP T-except-NOM eat.PRS
“Everybody except Taro eats this dessert.”


(51) 太郎はりんご以外(を)食べた。
Taroo-wa ringo-igai(-o) tabe-ta.
T-TOP apple-except-ACC eat-PST
“Taro ate everything except the apple.”


The two examples show exception phrases in the nominative and accusative, 
respectively. It is independently established that the topic marker -wa cannot 
immediately follow case markers (Watanabe 2009); that is, a case marker and the 
topic marker cannot co-occur: 
(52) a. 太郎はりんご以外を(*は)食べた。
Taroo-wa ringo-igai-o-(*wa) tabe-ta.
T-TOP apple-except-ACC-TOP eat-PST
b. 太郎はりんご以外(*を)は食べた。
Taroo-wa ringo-igai-(*o-)wa tabe-ta.
T-TOP apple-except-ACC-TOP eat-PST
“Taro ate everything but the apple.”


Given the scrambling options discussed earlier, we can identify (52b) as an
instance of a free exceptive with an implicit associate, an option widely attested 
in free exceptives. Though we do not have instrumental measures to support it, 
the prosody of (52b) includes breaks after each topic-marked phrase, and the pitch 
after the exception expression does not go down, which accords with observations 
on the prosody of topic expressions in Japanese (Nakanishi 2001). Meanwhile, (52a)
does not include a prosodic break after the object and there is no pitch reset. An 
instrumental investigation of prosodic differences between examples such as (52a) 
and (52b) is called for. However, for now, we would like to propose that (52a) is 
an instance of a connected exceptive with a silent (null-pronominal) associate, 
whereas (52b) is a genuine free exceptive. As such, the two examples reflect two 
distinct types of “missing” associates. Given that the associate in the connected
exceptive is expressed as a null pronominal, the linker no is deleted and the case 
marker directly follows igai. 


(53) [[ringo-igai-<no>] pro]-o
apple-except-GEN pro-ACC


If this analysis is correct, we can also predict that postpositions, as with case 
markers, can follow igai in connected exceptives with the null associate. This pre- 
diction is confirmed: 


(54) 太郎は花子以外からチョコレートをもらった。
Taroo-wa Hanako-igai(-pro)-kara chokoleetto-o moratta.
T-TOP H-except-from chocolate-ACC receive.PST
“Taro received chocolate from everyone except from Hanako.”


Unlike case-marked exceptives, where the order “case-marker-before-igai” is 
simply unavailable, postpositions can appear either after the exceptive marker, as 
in (54), or before it: 
(55) 太郎は花子から以外チョコレートをもらった。
Taroo-wa Hanako-kara-igai chokoleetto-o moratta.
T-TOP H-from-except chocolate-ACC receive.PST
“Taro received chocolate from everyone except from Hanako.”


The difference, as we contend, again boils down to the difference between con- 
nected and free exceptives; in (54), there is a null-pronominal associate in a con- 
nected exceptive, marked off by the postposition. In (55), the postposition igai 
stacks on the postposition kara forming an exceptive phrase. Table 3 summarizes 
the distributional properties of Japanese exceptives with a missing associate. The 
linear order of the exceptive marker and postpositions or case markers partially 
resolves the structural ambiguity in the two types of associates. 


Table 3: Japanese exceptives with unexpressed associate.

                  Free exceptive with          Connected exceptive with
                  implicit associate           null associate
Case marker       impossible                   follows igai
Postposition¹⁵    precedes igai                follows igai
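
The distribution in Table 3 can be restated as a simple lookup. The Python sketch below is purely expository: the function name, the string labels, and the dictionary layout are hypothetical choices made here for illustration, and the table itself is the only source of the values.

```python
from typing import Dict, List, Tuple

# Keys: (kind of marker, its position relative to igai).
# Values: analyses of the exceptive compatible with that order, per Table 3.
TABLE_3: Dict[Tuple[str, str], List[str]] = {
    ("case marker", "precedes igai"): [],
    ("case marker", "follows igai"): ["connected exceptive with null associate"],
    ("postposition", "precedes igai"): ["free exceptive with implicit associate"],
    ("postposition", "follows igai"): ["connected exceptive with null associate"],
}


def compatible_analyses(marker: str, position: str) -> List[str]:
    """Look up which analyses Table 3 allows for a given marker-igai order."""
    return TABLE_3[(marker, position)]


# Example (54): the postposition kara follows igai.
analysis_54 = compatible_analyses("postposition", "follows igai")
# Example (55): the postposition kara precedes igai.
analysis_55 = compatible_analyses("postposition", "precedes igai")
```

An empty list corresponds to the unavailable order in which a case marker precedes igai.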


The next question that arises has to do with the licensing conditions on null asso- 
ciates in the connected exceptive. Null associates in exceptive phrases have been 
reported for other languages, Arabic in particular (Al-Bataineh 2021), but, crucially, 
in Arabic, the null associate is licensed by negation. In Japanese, as shown by the 
examples, null associates can also be licensed in affirmative clauses. 

Another outstanding issue raised by the data regards language processing. 
Given structural ambiguity between free exceptives with implicit associates and 
connected exceptives with null associates, how is this ambiguity reflected in real
time? This question could inform a future experimental study where the two orders
of postposition and igai, such as (54) and (55), could be compared systematically. 


15 The marker ni has been subject to much discussion in the literature on Japanese, with ongo- 
ing debates about its status as a case marker or a postposition (e.g., Sadakane & Koizumi 1995). 
Its distribution in exceptives can be used as an additional argument in favor of its status as a 
postposition, as it can precede or follow igai. 
6 Exceptive or exceptive impostor? 


The discussion thus far has been limited to igai, but other particles in Japanese have 
been claimed to express an exceptive meaning, in particular, the focus particles 
dake and shika, both of which correspond to the English “only” or “just.” Both have 
been traditionally analyzed as focus particles denoting exclusion, hence the paral- 
lels with the English only. 

Researchers seem to converge on the conception that dake should be analyzed 
as a general focus particle (see Futagi 2004 and references therein). Further, dake 
can combine with shika and igai, which also suggests that its function is different 
from that of the exceptive marker. We can, therefore, set dake aside as a general- 
ized focus particle whose meaning of exclusion arises via inference. As for shika, 
things are a bit more complicated. One of the key properties that distinguish shika 
from dake is its sensitivity to polarity. That is, shika requires a clause-mate negative 
(suffix) na(kat) as its licensor, as in (56). 


(56) a. Taroo-shika ko-nakat-ta.
        T-only come-NEG-PST
        “Only Taro came.”
     b. *Taroo-shika ki-ta.
        T-only come-PST

However, as the English paraphrase shows, this negation makes no visible semantic 
contribution to the resulting sentence meaning: despite the negative suffix on the verb, 
(56a) has roughly the same meaning as exceptive examples without negation. 
This raises the question as to how the meaning of a sentence containing shika is 
derived compositionally, and, further, whether the traditional assumption that 
shika is an exclusive particle should be maintained. We address these questions by 
comparing the semantic properties of shika and igai. 

In comparing shika and igai, let us start with similarities, which have to do 
with the ability to antecede coreferential pronouns. To illustrate, the examples 
below, adapted from Kuno (1999), describe the same situation: nobody except Taro 
was wearing a seatbelt, which is why only Taro survived. When Taro is marked 
with the particle dake, the null pronoun in the following sentence cannot pick out 
the other individuals that are part of the exclusive meaning (i.e., it cannot mean 
“they”), as shown in (57b). This is consistent with the status of dake as a regular focus 
particle. However, when Taro appears with either shika or igai, the null pronoun 
in the subsequent sentence cannot pick out Taro as its referent. Thus, its referent is 
restricted to “they,” cf. (58a) and (59a). 


(57) a. Taroo-dake-ga tasukat-ta. pro siitoberuto-o si-tei-ta-kara-da.
        T-only-NOM survive-PST seatbelt-ACC wear-GER-PST-COP
        “Only Taro survived. That's because he was wearing a seatbelt.”
     b. Taroo-dake-ga tasukat-ta. #pro siitoberuto-o si-tei-nakat-ta-kara-da.
        T-only-NOM survive-PST seatbelt-ACC wear-GER-NEG-PST-COP
        “Only Taro survived. That's because they were not wearing a seatbelt.”


(58) a. Taroo-shika tasukara-anakat-ta. #pro siitoberuto-o si-tei-ta-kara-da.
        T-only survive-NEG-PST seatbelt-ACC wear-GER-PST-COP
        “Only Taro survived. That's because he was wearing a seatbelt.”
     b. Taroo-shika tasukara-anakat-ta. pro siitoberuto-o si-tei-nakat-ta-kara-da.
        T-only survive-NEG-PST seatbelt-ACC wear-GER-NEG-PST-COP
        “Only Taro survived. That's because they were not wearing a seatbelt.”


(59) a. Taroo-igai tasukara-anakat-ta. #pro siitoberuto-o si-tei-ta-kara-da.
        T-except survive-NEG-PST seatbelt-ACC wear-GER-PST-COP
        “Only Taro survived. #That's because he was wearing a seatbelt.”
     b. Taroo-igai tasukara-anakat-ta. pro siitoberuto-o si-tei-nakat-ta-kara-da.
        T-except survive-NEG-PST seatbelt-ACC wear-GER-NEG-PST-COP
        “Only Taro survived. That's because they were not wearing a seatbelt.”
This difference in the possible referent of the null pronoun suggests that the 
dake-sentence in (57) is about Taro, while the shika-sentence and the igai-sentence 
are about the associate, not the exception. This notion is what motivates an analysis 
under which shika, just as igai, is analyzed as an exceptive marker. For example, 
Yoshimura (2007) proposes a universal exceptive marker analysis of shika: she con- 
tends that shika is an exceptive marker whose semantic representation includes a 
universal quantifier. Hence, under her analysis, Only Taro survived is not an accu- 
rate paraphrase of (56a). Instead, it should be paraphrased as Everyone except Taro 
did not survive. Now the meaning of (56a) can be derived compositionally, as the 
semantic input of negation is evident in its interpretation (did not survive for the 
non-exceptions vs. survived for the exception). 

However, several significant differences separate shika and igai, which cast 
doubt on the view that shika is an exceptive marker. As discussed, shika is polar- 
ity-sensitive and requires a clause-mate negative suffix na(kat) as its licensor. 
Moreover, Hasegawa (2010) observes that the negation licensing shika does not 
behave in the same way as ordinary negation. As shown below, the negation that 
co-occurs with shika cannot license the negative polarity item (NPI) nanimo, as in (60). 
In this respect it differs from the negation that co-occurs with dake and igai, which can 
license an NPI, as in (61) and (62). 


(60) *Taroo-shika nanimo tabe-nakat-ta.
     T-shika anything eat-NEG-PST


(61) Taroo-dake nanimo tabe-nakat-ta.
     T-only anything eat-NEG-PST
     “Only Taro didn't eat anything.”


(62) Taroo-igai nanimo tabe-nakat-ta.
     T-except anything eat-NEG-PST
     “Except Taro, nobody ate anything.” (lit.: ... everyone did not eat anything)


16 However, exceptives marked by igai can occur with or without negation, and exceptives of this 
type are more common in affirmative clauses, something that may be lost in discussion of exceptive 
constructions in theoretical papers. In corpus counts based on the 1,000,000 sentence train-1 por- 
tion of the corpus ASPEC, approximately 88% of igai-exceptives are found in affirmative clauses. 
Additionally, Hasegawa notes that, while the exceptive meaning that Taro came 
is cancelable in (64), the same information introduced by shika in (63) is not. This 
suggests that the exceptive meanings that shika and igai contribute are of different 
types (entailment/presupposition and implicature, respectively; see also Ido and 
Kubota 2021). 


(63) #Taroo-shika ko-nakat-ta-shi, Taroo-mo ko-nakat-ta.
     T-shika come-NEG-PST-and T-also come-NEG-PST
     “Only Taro came, and Taro also didn't come.”


(64) Taroo-igai ko-nakat-ta-shi, Taroo-mo ko-nakat-ta.
     T-except come-NEG-PST-and T-also come-NEG-PST
     “No one other than Taro came, and Taro also didn't come.”


For these reasons, Hasegawa concludes that shika is not an exceptive marker, 
arguing in favor of the traditional view that shika is an exclusive particle. Follow- 
ing this conclusion, we also assume that shika is an exclusive particle, while igai is 
a genuine exceptive marker. 


7 Conclusions 


This chapter began by introducing exceptives as constructions that express exclusion. 
Thus, they comprise an exceptive phrase, which excludes the exception from the 
domain of an associate. 


(65) Everyone   laughed   [except             Mary]
     ASSOCIATE            EXCEPTIVE MARKER    EXCEPTION
                          [...     EXCEPTIVE PHRASE     ...]


We presented and analyzed the expression of exception in Japanese, formally 
marked with the postposition igai. As a postposition, igai combines with an NP. The 
internal structure of that NP can be quite complex; in particular, it can include 
a nominalized CP. Japanese allows for connected and free exceptives, which differ by 
whether the exception and the associate form a constituent (yes for the former, 
no for the latter). We have shown that Japanese free exceptives always include 
underlying nominalized CPs (sometimes headed by a null nominal head), with 
elided material. This kind of ellipsis is different from clausal ellipsis in exceptives 
in languages like English, where no nominal or determiner head is attested. Until 
now, only two types of free exceptives have been recognized: non-clausal phrasal 
ones (unattested so far) and clausal (with ellipsis), as in English or Egyptian Arabic 
(Soltan 2016). Thus, the novel Japanese results enrich the existing typology of excep- 
tive constructions by recognizing a nominalized CP as another source of exceptive 
constructions. 

Among other languages whose exceptives have been studied, Japanese also 
stands out as the only language thus far where both free and connected exceptives 
can have a null associate, which does not have to be licensed by negation. On the 
one hand, given the proliferation of null nominals in Japanese, it is not unexpected 
that null associates in Japanese exceptives are readily available. On the other hand, 
the exact licensing conditions on these null expressions are not yet properly understood. 

Finally, Japanese contributes novel data to the observation that the original 
constraint on universal quantifiers in the associate of an exceptive is too strong. 
García Álvarez (2008: 13-21) and Galal (2019) have already called it into question 
based on English data, and Japanese serves as another reminder that more seman- 
tic work is needed to understand the nature of the domain of generalization in 
exceptives. 

While the main focus has been on the exceptive constructions with the postpo- 
sition igai, which we consider a dedicated exceptive marker, we have also discussed 
the expression of exclusion with the particles dake and shika. Although these 
particles can mark an exception to a generalization, this appears to be a side effect 
of their semantics, not their dedicated function. Thus, they are not exclusive to 
exceptive constructions. 